The emergence of generative AI such as ChatGPT has dramatically accelerated enterprise AI adoption. Many organizations are evaluating AI to improve operational efficiency and make better use of their data, and many professionals already find AI indispensable in their daily work.
However, one concern frequently stalls implementation, particularly among executive leadership and IT departments: "Is it safe to use internal data with AI?"
Indeed, there have been cases where confidential data entered into generative AI leaked externally. Moreover, handling data that contains personal information requires attention not only to internal policies but also to global regulatory compliance.
This article outlines a practical approach to using generative AI and data analytics platforms safely while complying with regulations, including India's, on the premise that AI adoption will proceed, together with the support we provide.
The most commonly cited concern regarding AI implementation is "information leak risk." External services like generative AI may store input information on service providers' systems, potentially placing it beyond organizational control.
In one past incident, while a company was promoting internal efficiency improvements with ChatGPT, an employee mistakenly entered source code and customer information into the AI, and the data leaked externally. AI is convenient, but because input is effectively irretrievable once submitted, a minor operational error can escalate into a major incident.
Such risks directly affect business credibility and customer relationships; a leak could trigger business suspension or legal liability and threaten business continuity itself.
Additionally, the global tightening of personal data protection regulations cannot be ignored. Beyond the EU's GDPR, India enacted the DPDP Act in 2023, mandating protection of the "digital personal data" that companies handle, with violations subject to fines of up to 2.5 billion rupees (approximately 4.5 billion yen).
AI adoption is no longer merely about technology implementation, but has become a theme that combines "law, ethics, and management decisions."
How can we safely utilize AI? The first requirement is "labeling internal data according to importance levels and clearly defining what can and cannot be input into AI."
For example, a classification along the following lines is necessary (the labels here are illustrative):
- Level 1: Public information (press releases, published materials)
- Level 2: General internal information (routine business documents)
- Level 3: Confidential information (contracts, unreleased plans)
- Level 4: Personal data (names, contact details, customer records)
- Level 5: Sensitive personal data (health, financial, or government-ID data)
Establishing such rules company-wide and thoroughly educating employees is the first step; a minimal policy check along these lines is sketched below.
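As a minimal sketch of how such a rule can be enforced in tooling, the snippet below encodes an illustrative five-level scheme and a gate that decides whether data may be sent to an external AI service. The level names and the threshold are assumptions for the example, not a prescribed standard.

```python
from enum import IntEnum

# Illustrative sensitivity levels -- the actual labels should follow
# your organization's own classification policy.
class Sensitivity(IntEnum):
    PUBLIC = 1              # press releases, published materials
    INTERNAL = 2            # routine business documents
    CONFIDENTIAL = 3        # contracts, unreleased plans
    PERSONAL = 4            # names, contact details, customer records
    SENSITIVE_PERSONAL = 5  # health, financial, government-ID data

# Assumed policy: only data at or below this level may go to external AI.
AI_INPUT_THRESHOLD = Sensitivity.INTERNAL

def allowed_for_ai(label: Sensitivity) -> bool:
    """Return True if data with this label may be entered into generative AI."""
    return label <= AI_INPUT_THRESHOLD

print(allowed_for_ai(Sensitivity.INTERNAL))  # True
print(allowed_for_ai(Sensitivity.PERSONAL))  # False
```

Even a simple gate like this makes the rule executable rather than a document employees must remember, which is where education alone tends to fail.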
India's DPDP Act in particular requires that data use be based on clear consent from the individual, which in some respects is stricter than the EU's GDPR. For businesses targeting the Indian market, mechanisms for regulatory compliance and risk mitigation are essential.
However, stopping AI utilization altogether out of risk concerns would be counterproductive. The key is deciding where to strike the balance between securing defenses and pursuing aggressive utilization. Building the judgment criteria and mechanisms for this within the organization is now a management issue.
To address these challenges, we’ve built a solution that consists of three pillars:
1 - Regulation-Based Data Labeling
Based on the standards required by regulations in various countries, including the GDPR, Japan's Act on the Protection of Personal Information, and India's DPDP Act, we comprehensively classify internal data into five sensitivity levels. Sensitive personal information and specific personal information (as defined under Japanese law) are strictly detected and visualized, providing the baseline for future data protection.
Because the setup can be completed in a single pass, it is well suited to departments that want to begin using AI right away.
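To illustrate rule-based sensitivity labeling, here is a minimal sketch, not our actual detection engine, which uses far richer patterns plus dictionary and context checks. It assigns a coarse level from simple pattern matches; the regexes and level mapping shown are assumptions for the example.

```python
import re

# Illustrative detection rules only.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "national_id_like": re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"),  # 12-digit ID format
    "phone_like": re.compile(r"\b\d{10}\b"),
}

def detect_sensitivity(text: str) -> int:
    """Assign a coarse sensitivity level (1-5) based on which patterns match."""
    if PATTERNS["national_id_like"].search(text):
        return 5  # government-ID format -> sensitive personal information
    if PATTERNS["email"].search(text) or PATTERNS["phone_like"].search(text):
        return 4  # directly identifying personal data
    return 2      # default: general internal information

print(detect_sensitivity("Contact: priya@example.com"))  # 4
print(detect_sensitivity("ID 1234 5678 9012 on file"))   # 5
```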
2 - Detection of Combinatorial Storage Risks
To detect storage risks where combining individually harmless fields could identify a person, we analyze data across the columns of each table and determine whether the resulting sets of value combinations narrow down to specific individuals.
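The product's internal method is not detailed here, but a common way to implement this kind of check is a k-anonymity-style count over candidate quasi-identifier columns. The pandas sketch below flags any value combination shared by fewer than k rows; the function name, column names, and threshold are illustrative.

```python
import pandas as pd

def combination_risk(df: pd.DataFrame, quasi_identifiers: list[str], k: int = 5) -> pd.DataFrame:
    """Flag value combinations shared by fewer than k rows (re-identification risk)."""
    counts = df.groupby(quasi_identifiers).size().reset_index(name="group_size")
    return counts[counts["group_size"] < k]

# Example: none of these columns identifies a person on its own,
# but together they may single someone out.
df = pd.DataFrame({
    "postal_code": ["110001", "110001", "110001", "560034"],
    "birth_year":  [1985, 1985, 1990, 1985],
    "gender":      ["F", "F", "M", "F"],
})
print(combination_risk(df, ["postal_code", "birth_year", "gender"], k=2))
```

Combinations flagged this way can then be masked, generalized, or excluded from AI-accessible storage.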
3 - Automatic Generation and Assignment of Metadata
Once processes 1 and 2 above have secured the management of personal information and laid a foundation for AI utilization, the next requirement is to create metadata for each dataset, organizing the data so that AI can readily understand and use it.
Even when a table name does not clearly indicate its contents, or column names are abstract, AI can only interpret the data accurately once metadata explaining what the data represents has been assigned. This lets AI analyze and aggregate the data with higher accuracy.
Assigning metadata to proprietary data is therefore a crucial step in maintaining strong defenses while pursuing AI utilization. Manually reviewing every dataset to write metadata is typically very labor-intensive; our solution generates it automatically.
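Setting our solution's internals aside, the idea can be sketched with any LLM API. The example below uses the OpenAI Python client to draft a one-sentence description for a cryptically named column; the model choice, helper function, and table/column names are illustrative, and only data already cleared by steps 1 and 2 should ever be sent.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_column_description(table: str, column: str, samples: list[str]) -> str:
    """Ask an LLM to draft a one-sentence description of a column.
    NOTE: pass only data cleared at steps 1 and 2 -- never raw personal data."""
    prompt = (
        f"Table '{table}' has a column '{column}' with sample values "
        f"{samples}. In one sentence, describe what this column represents."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-completion model works for this sketch
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(generate_column_description("t_ord_01", "cst_rgn_cd", ["KA", "MH", "TN"]))
# e.g. "A short code identifying the customer's region or state."
```

A human review pass over the generated descriptions keeps the automation from propagating wrong interpretations into downstream AI analysis.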
This solution enables stakeholders across the organization to realize value such as the following:
- Executive leadership: assurance that AI utilization stays within regulatory bounds such as the GDPR and the DPDP Act
- IT and security departments: visibility into where sensitive data resides and how combinations of data create risk
- Business departments: a labeled, metadata-rich data foundation that lets them begin using AI immediately
Digital transformation (DX) premised on AI utilization is a source of competitive advantage for future business. Advancing AI utilization realistically, however, requires properly managing and organizing proprietary data: specifically, regulation-compliant "personal information protection" and "metadata management" designed for AI utilization.
We propose a practical approach that assumes AI utilization: understand and classify data risks, then implement appropriate measures where they are needed.
Rather than concluding "AI is dangerous and unusable," we ask "how can we make it usable safely?" and then make it happen.
We invite those interested in our solution to apply for a detailed service introduction through the form below.
Let's advance DX together while protecting both company growth and customer trust.