The Indian Railways (IRCTC) made headlines earlier this week when news broke that it had invited bids from firms for recommendations on how to monetize passenger data. While the Railways has since denied the reports twice, calling them ‘fictitious’, the tender document can be viewed on the firm’s website. The denial and the document are at odds with each other.
Conversations with people at the Ministry of Electronics & Information Technology (MeITY) suggest much is going on in the background. It appears the intent is to position India as the world’s capital for building Artificial Intelligence (AI) models. And they don’t rule out the possibility that IRCTC, though it has now stalled the idea, wanted to be seen as having a “first-mover advantage”, along with the pharma giants, e-commerce firms and fintech companies pushing the boundaries of technology.
The background to this is that a Personal Data Protection (PDP) Bill for India was in the works for close to a decade. It was first introduced in the Lok Sabha in December 2019, but was withdrawn abruptly earlier this month, with MeITY issuing a statement that an amended Bill would be introduced soon. The Bill, in its original avatar, recognised privacy as a fundamental right. Speaking off the record, those working on the Bill claim this right will be upheld, and that the amended version will be tabled in Parliament by December this year.
Before getting into the amendments, what got people worked up about IRCTC’s proposals?
The first is that the scope of work defined in the document clearly includes examining personal data: names, age, mobile numbers, gender, addresses, email IDs, login IDs and passwords, among others. This is a privacy violation. Ironically, the document also states that the selected firm “Shall study various Acts or laws including IT Act 2000 and its amendments, user data privacy laws including GDPR (General Data Protection Regulation) and current ‘Personal Data Protection Bill 2018 of India, and accordingly propose the business models for monetization of Digital Assets.”
But those working on the policy argue there is nothing “extraordinary” here. By way of example, they point out that the world over, businesses that deal with data operate in a fuzzy zone. Fintech firms deploy AI to scrutinise personal data to decide whether or not to lend. Hospitals use medical scans to predict the outcomes of diseases such as cancer. E-commerce websites and browsers store data to serve up personalised ads. And how do you ignore that the worst offenders include Google’s Chrome browser and Amazon’s voice assistant Alexa? All these entities argue this means better outcomes for all stakeholders. But does it?
Take medical data, for instance. Someone may have consented to offer their personal data for diagnosis. This is primary data. But they may not have consented to their data being used for cancer research (secondary data), even if it is anonymised. A pharma firm can argue that consent to collect data implies it can be used as both primary and secondary data. Not just that: studying such data is how medical knowledge grows over time. Fintech firms, e-commerce websites and internet browsers use the same argument.
The counterargument here is that secondary data is monetized. The silos where this data resides earn the entities that hold them large sums of money. And over time, as their silos grow, they learn to build better models with better outcomes. And earn more.
This raises a thorny question: what’s in it for someone whose primary data is being deployed as secondary data? To get around this, a new experiment tentatively called the “Differential Privacy Model” is being tested across India. It could not be independently confirmed whether IRCTC is one of the test cases. But what could be confirmed is that work is underway to create “Bio Banks”: repositories where large samples of medical data, tissue samples and genetic data are stored. India is an ideal place to do this because of its diverse population.
A use case? Pharma companies can use their models at such banks to test the efficacy of drugs under development on various samples.
As things are, to build a model that can tell a cat from a dog, for instance, an AI must pore over images that reside in a database. In the proposed scheme, the model is first trained to understand the differences by feeding it labelled images of cats and dogs.
For the model to get better, it can then inspect databases where such images exist. An agency, operating out of a “Certified Clean Room”, provides a “Computational Guarantee” that no personally identifiable information was used and that the model works. Extrapolate this to other ecosystems such as fintech, medicine and e-commerce, and the potential begins to make itself obvious.
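The tender and the officials quoted above do not spell out how the “Differential Privacy Model” would work in practice. One standard illustration of the underlying idea is a differentially private count: an aggregate query over a database is answered with calibrated random noise added, so the released figure reveals almost nothing about any single record. A minimal sketch in Python (the function names and the toy passenger data are hypothetical, not from the tender):

```python
import math
import random

def laplace_noise(scale):
    # Draw Laplace(0, scale) noise via inverse-transform sampling.
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon=1.0):
    # A count query has sensitivity 1 (adding or removing one person
    # changes it by at most 1), so Laplace noise of scale 1/epsilon
    # gives epsilon-differential privacy for the released figure.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Example: how many passengers in a toy dataset are over 60?
passengers = [{"age": a} for a in (23, 64, 71, 35, 68, 49)]
noisy_answer = private_count(passengers, lambda p: p["age"] > 60, epsilon=0.5)
```

A smaller epsilon means more noise and stronger privacy; the querying entity trades accuracy for the guarantee that no individual passenger can be singled out.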
An additional layer being worked on is that any entity that wishes to access these databases must pay a fee. Eventually, these fees will be distributed as royalties to people whose secondary data resides in the databases.
So, hypothetically, if IRCTC’s database were to be used by an entity to refine its model, then the monies paid to IRCTC must be distributed among the people whose data it touched as well. One way to do this is to offer discount coupons, for instance, to book train travel. Or vouchers to buy meals on long distance trains.
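The article outlines this revenue split only in principle. The arithmetic of a pro-rata payout could look like the following sketch, where the function name, the operator’s share and every figure are hypothetical:

```python
def royalty_per_passenger(access_fee, operator_share, num_records_touched):
    # Hypothetical split: the data holder (e.g. IRCTC) retains a share of
    # the access fee, and the remainder is divided equally among the
    # passengers whose records the paying entity's model touched.
    passenger_pool = access_fee * (1 - operator_share)
    return passenger_pool / num_records_touched

# A fee of 1,00,000 rupees, a 40% operator share, 600 records touched:
payout = royalty_per_passenger(100_000, 0.40, 600)  # about 100 rupees each,
# perhaps credited as a discount coupon or a meal voucher on a future booking.
```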
While this sounds wildly ingenious and ambitious, it also raises multiple questions that Indian policymakers must deal with. To begin with, if a Biobank is financially lucrative, how do you safeguard the financially vulnerable, who wouldn’t mind giving up tissue and privacy for a few pennies? We’ll wait until the year-end for more details.