India’s artificial intelligence (AI) moment has clearly arrived. The government’s IndiaAI Mission is under way. Efforts to develop indigenous language models, enforce stronger data governance and manufacture home-grown chips are gathering steam. Start-ups are experimenting with AI across sectors. Tech companies are embedding AI into everyday hardware. As AI adoption accelerates, its ethical blind spots are becoming harder to ignore. Against this backdrop, one critical question remains: can India build AI that is also fair, secure and inclusive?
Evolving threat landscape in India
The catch in AI-powered spam detection
AI-powered spam detection is becoming a standard tool for telecom companies and digital platforms, helping them filter harmful or unwanted content at scale. However, its effectiveness remains uneven. At times, filters can overreach or misfire, reflecting the broader blind spots of AI systems. There is often no way to correct a mislabelled call or message, and no room for context. Take insurance calls, for example. For some, they are pure spam, unrequested and relentless. For others, especially those actively managing a policy or filing a claim, the same calls may be essential. This issue plays out across every business that relies on outbound calls to reach potential or existing customers. Therefore, instead of treating spam as a purely technical issue, companies should train AI on more contextual data, such as communication history and user preferences, to reduce the risk of misclassification.
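The idea of blending content signals with user context can be sketched in a few lines. This is a minimal illustration only: the function, the context signals (prior contact counts, category opt-ins) and the weights are all invented for demonstration and are not drawn from any real spam filter.

```python
# A hypothetical sketch of context-aware spam scoring. All names and
# weights here are illustrative, not a production design.

def spam_score(content_score: float,
               prior_contacts: int,
               user_opted_in: bool) -> float:
    """Blend a content-based spam probability with caller context.

    content_score : probability from a text/voice classifier (0..1)
    prior_contacts: how often the user has previously engaged this sender
    user_opted_in : whether the user consented to this category of calls
    """
    score = content_score
    # An active relationship (e.g. an open insurance claim) lowers the score.
    if prior_contacts > 0:
        score *= 0.5
    # Explicit consent for the category lowers it further.
    if user_opted_in:
        score *= 0.3
    return score

# The same insurance call is scored differently for different users:
cold_call = spam_score(0.9, prior_contacts=0, user_opted_in=False)  # stays 0.9
claimant = spam_score(0.9, prior_contacts=5, user_opted_in=True)    # drops to 0.135
```

The point is not the particular multipliers but the shape of the decision: a filter that sees only message content must give every user the same answer, while one that also sees the relationship between sender and recipient can tell the claimant apart from the cold-call target.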
Generative AI and newer vulnerabilities
Filtering spam used to be a technical problem. With generative AI, it is becoming a social and psychological one too. Deepfake political speeches, synthetic identity fraud and AI-generated voice scams have already begun to surface in India. Deepfake-related cases in India have surged by 550 per cent since 2019, with projected losses hitting Rs 700 billion in 2024.
But the real shift is in believability. Traditional spam filters, which rely on keywords or known patterns, are no match for AI-generated scams that adapt and evolve rapidly. These messages often sound natural, human and credible. By the time detection models catch up, the scam tactics have already changed. India’s digital vulnerabilities add to the problem. Many users still bypass official app stores, use cracked software, or have limited awareness of cybersecurity basics such as firewalls or antivirus tools. Advanced keyloggers can now interpret multilingual keystrokes, which makes Indian users especially vulnerable. Once inside a system, attackers often escalate to blackmail or extortion. This poses a serious challenge for both platforms and policymakers.
India needs a coordinated public awareness campaign across languages and platforms, warning users about voice cloning, deepfakes and emotionally manipulative fraud. In addition, platforms should be mandated to watermark AI-generated media, something regulators in jurisdictions such as the European Union and China are already pushing for.
AI at the hardware level
Conversations around ethical AI often focus on software, but increasingly the groundwork is being laid at the hardware level. Modern chips now come with dedicated AI accelerators, such as neural or tensor processing units, which accelerate AI/machine learning functions on devices. This is what enables a phone to unlock using facial recognition in low light or translate messages instantly. It is fast, efficient and increasingly invisible. But that very invisibility raises important questions. These chips decide how much data stays local, whether biometric processing happens on the device or in the cloud, and how securely that information is handled. Most users never see these choices being made, yet they directly shape how AI behaves and whether it respects user privacy, autonomy and consent.
To address this concern, smartphone makers and chipset manufacturers should clearly disclose when on-device AI features, especially those involving biometrics such as facial recognition or voice input, are active. Users should be able to easily access privacy settings that let them opt in or out of these features, including offline processing, cloud syncing and data sharing. Some of these controls already exist, but they need to be made more transparent, accessible and mandatory by design.
Data gaps and risks
Ethical AI begins with ethical data: who collects it, who controls it and who benefits from its use. In India, that foundation remains deeply flawed. For years, personal data has been harvested with vague or coerced consent. For instance, across the country, users of fintech platforms have been nudged or even tricked into sharing sensitive information without fully understanding how it is being stored or shared. With AI now in play, those risks are scaling rapidly. According to an industry report, in 2024 alone, India recorded digital fraud losses of over Rs 228.12 billion, with a significant share linked to AI-driven scams.
The problem runs deeper. AI systems are only as fair or accountable as the data they are trained on. Yet in India, data sets are often scraped from social media, online forums and government databases, usually without disclosure, permission, or informed consent. The recent public warning from the Cyber Crime Wing in Chennai highlights the stakes. Ghibli AI Art generators, which convert user selfies into stylised animations, may appear harmless. But they often collect and store biometric data without transparency, leaving users exposed to deepfakes, identity theft and other online threats.
This issue extends beyond consumer apps. AI tools such as ChatGPT, managed by private, foreign-owned companies, operate largely outside government oversight. While there is no ban on their use by government departments, official guidelines urge public servants to exercise caution and ensure the confidentiality of sensitive data. But without enforceable rules or transparency on how cloud-based AI tools process and store information, the risks to public data security remain unresolved.
The Digital Personal Data Protection Act is a step in the right direction, but its enforcement requires teeth. That includes imposing real penalties for misuse and creating public-facing dashboards to show which companies are compliant. Consent interfaces should be simplified across apps using standardised formats instead of dense legalese.
Language bias in AI models
India is home to hundreds of languages and dialects, but most AI models are built on data that barely reflects the country’s vast linguistic and cultural diversity. For example, while Hindi and Bengali have some online presence, languages like Bodo, Konkani and Santali remain almost invisible in digitised form. Even widely spoken languages such as Punjabi and Marathi lack comprehensive, diverse corpora, such as legal, technical, or scientific texts, limiting their usefulness for training AI models. This creates a deeper problem. Indian users frequently switch between languages mid-sentence, a practice known as code-switching, and use words that change meaning based on context, tone, or region. Most AI models, built on monolingual data sets, are simply ill-equipped to handle this complexity. For resource-poor languages like Marathi, the absence of annotated data sets for code-switching makes it even harder for AI to grasp real-world usage. This results in robotic translations, context-blind recommendations, and tools that fail to engage meaningfully with users in their native tongues. These are not just technical gaps; they are barriers to access. When AI cannot understand a user, it effectively excludes them from the digital conversation.
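A toy example shows why code-switched text defeats monolingual assumptions. The sketch below labels tokens in a mixed Hindi-English sentence by script using Unicode ranges; this is deliberately simplistic, since romanised Hindi ("mujhe", "karni") looks identical to English at the script level, which is precisely the difficulty real language-identification systems must solve.

```python
# Label tokens by script to illustrate code-switching. Script detection
# alone cannot separate romanised Hindi from English; real systems need
# trained language-ID models, not this kind of range check.

def token_script(token: str) -> str:
    """Label a token by its dominant script (Devanagari vs Latin)."""
    deva = sum('\u0900' <= ch <= '\u097F' for ch in token)
    latin = sum(ch.isascii() and ch.isalpha() for ch in token)
    if deva > latin:
        return "Devanagari"
    if latin > deva:
        return "Latin"
    return "Other"

sentence = "mujhe kal meeting cancel karni hai यार"
labels = [(t, token_script(t)) for t in sentence.split()]
# Romanised Hindi and English both come out "Latin", while यार is
# Devanagari, so no single language tag fits the sentence.
```

A model trained only on monolingual Hindi or monolingual English corpora sees such a sentence as partly out-of-vocabulary noise, which is why annotated code-switched data sets matter.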
To bridge this gap, the Indian government launched the Bhashini project under the Digital India initiative. Its goal is to develop AI-powered translation and speech recognition tools for all 22 scheduled languages and several major dialects. The National Language Translation Mission aims to make educational content and government services accessible in regional languages through AI. Tech companies and research institutions are also stepping in. Native speakers are being encouraged to contribute speech and text samples. Institutions like IIT Madras and IIIT Hyderabad are developing models tailored to the linguistic realities of Indian users.
Bias in facial recognition
The representation gap extends beyond language. Globally, facial recognition systems have been shown to perform significantly worse on darker-skinned individuals, particularly women. As per an industry study, the error rates are as high as 34.7 per cent for darker-skinned women in commercial facial recognition systems. These visual biases are not limited to recognition; they show up in image generation, too. This is a safety issue for India. When facial recognition is used in policing, surveillance, or welfare distribution, errors can have serious consequences, such as wrongful detention, denial of entitlements, or exclusion from critical services. The root of the problem lies in flawed training data. When systems are trained primarily on lighter-skinned, male faces, everyone else becomes a statistical edge case.
Any AI used in public services, especially policing or surveillance, must be tested on data sets that reflect India’s full demographic diversity. Until such systems are proven to work equally well for all, their deployment in high-stakes environments should be paused.
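Testing across demographic groups is, at its core, a matter of reporting error rates per subgroup rather than one aggregate number. The sketch below uses invented group labels and outcomes purely for illustration; the point it makes is that a system can look accurate overall while failing badly for a particular group.

```python
# A minimal sketch of demographic-stratified evaluation. The groups and
# outcomes below are hypothetical, chosen only to show how an aggregate
# accuracy figure can hide a per-group failure.
from collections import defaultdict

def error_rates_by_group(results):
    """results: iterable of (group, correct) prediction outcomes."""
    totals, errors = defaultdict(int), defaultdict(int)
    for group, correct in results:
        totals[group] += 1
        if not correct:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Hypothetical outcomes: 100 samples from group_a, 20 from group_b.
outcomes = ([("group_a", True)] * 95 + [("group_a", False)] * 5 +
            [("group_b", True)] * 13 + [("group_b", False)] * 7)
rates = error_rates_by_group(outcomes)
# Aggregate error is 10 per cent, but group_b's error rate is 35 per cent,
# a disparity the aggregate figure completely conceals.
```

Mandating this kind of disaggregated reporting, on data sets that actually reflect India’s demographic diversity, is what separates a meaningful audit from a headline accuracy claim.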
In sum
India stands at a pivotal crossroads. On one side is a booming digital economy, world-class AI talent, ambitious government projects and a surge of real-world use cases. On the other is a growing list of unresolved ethical concerns around consent, bias, access and accountability. These two sides do not have to be at odds. In fact, the only way to build AI that is truly scalable and inclusive is to make ethics the starting point and not a patch applied after the damage is done. That means designing spam filters that are fair, building data sets that reflect India’s full diversity, embedding transparency into chip design, and hardwiring accountability into regulation.
Globally, ethical AI is gaining attention. But India has a chance to do something more: to lead with a model that is multilingual, decentralised, deeply aware of social context and built for real-world complexity.
India’s AI future depends not just on innovation, but on confronting the uncomfortable truths about how data is collected, handled and protected. Without that reckoning, ethical AI will remain more rhetoric than reality.