👋 Welcome!
As part of AI for Privacy, we developed “Efficient Detection of Personally Identifiable Information (PII) in Prompts Leveraging Synthetic Data Generation.” We generated synthetic data using zero-shot and few-shot approaches and used it to fine-tune the DistilBERT model for PII detection, strengthening data privacy in AI systems.
- Personally Identifiable Information (PII) is any data that can be used to uniquely identify, contact, or locate an individual. It includes direct identifiers such as names, email addresses, phone numbers, and government-issued IDs, as well as indirect identifiers such as date of birth, location data, or demographic details, which can identify a person when combined. Protecting PII is critical in AI and data-driven systems: it prevents unauthorized access, misuse, or exposure of sensitive personal information and supports compliance with privacy regulations and ethical standards.
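
To make the zero-shot versus few-shot distinction concrete, here is a minimal sketch of how a synthetic-data prompt might be assembled and how a tagged response could be turned into labeled spans for fine-tuning. Everything in it is an assumption for illustration: the label set `PII_LABELS`, the inline tag format, the demonstrations in `FEW_SHOT_EXAMPLES`, and the helper names `build_generation_prompt` and `parse_tagged` are hypothetical and are not taken from the project's code.

```python
import re

# Hypothetical label set, drawn from the direct identifiers listed above.
PII_LABELS = ["NAME", "EMAIL", "PHONE", "ID_NUMBER"]

# Hypothetical hand-written demonstrations; each PII value is wrapped in an inline tag.
FEW_SHOT_EXAMPLES = [
    "Please email <NAME>Jane Doe</NAME> at <EMAIL>jane.doe@example.com</EMAIL>.",
    "Call me back on <PHONE>555-0100</PHONE> after 5 pm.",
]


def build_generation_prompt(examples=()):
    """Return a zero-shot prompt when `examples` is empty, else a few-shot one."""
    lines = [
        "Write one realistic user prompt that contains made-up personal data.",
        "Wrap each PII value in a tag from this set: " + ", ".join(PII_LABELS) + ".",
    ]
    # Few-shot: append numbered demonstrations. Zero-shot: this loop adds nothing.
    for i, example in enumerate(examples, start=1):
        lines.append(f"Example {i}: {example}")
    lines.append("New prompt:")
    return "\n".join(lines)


# Matches <LABEL>value</LABEL>; the backreference requires the closing tag to match.
TAG_PATTERN = re.compile(r"<([A-Z_]+)>(.*?)</\1>")


def parse_tagged(tagged):
    """Strip inline tags and return (plain_text, [(start, end, label), ...])."""
    parts, spans = [], []
    cursor = 0   # position in the tagged string
    length = 0   # length of the plain text built so far
    for match in TAG_PATTERN.finditer(tagged):
        before = tagged[cursor:match.start()]
        parts.append(before)
        length += len(before)
        value = match.group(2)
        # Character offsets refer to the plain text, not the tagged string.
        spans.append((length, length + len(value), match.group(1)))
        parts.append(value)
        length += len(value)
        cursor = match.end()
    parts.append(tagged[cursor:])
    return "".join(parts), spans


zero_shot_prompt = build_generation_prompt()
few_shot_prompt = build_generation_prompt(FEW_SHOT_EXAMPLES)
text, spans = parse_tagged(FEW_SHOT_EXAMPLES[1])
```

In this sketch the only difference between the two prompts is whether demonstrations are included. The character spans returned by `parse_tagged` would still need to be aligned to model tokens before fine-tuning, a step that is omitted here.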