FORENSIC ANALYSIS - MOONSHOT AI - KIMI - KIMI2 DATA SOVEREIGNTY & PRIVACY ASSESSMENT.md

TL;DR

DO YOUR OWN DUE DILIGENCE. If privacy matters to YOU, read the analysis below and decide how you want to keep using KIMI, its derivatives, and other MoonshotAI products and services. THIS CONTENT was analyzed straight from their platform/site (via Claude).


🔴 CRITICAL FORENSIC ANALYSIS: MOONSHOT AI (KIMI) DATA SOVEREIGNTY & PRIVACY ASSESSMENT

EXECUTIVE THREAT SUMMARY

SEVERITY: HIGH-RISK ⚠️

After exhaustive line-by-line forensic analysis of all four legal documents, I must advise that Moonshot AI's terms create SUBSTANTIAL DATA EXPOSURE that CANNOT be fully mitigated if you use their services. Here are the critical findings:


🚨 CRITICAL RISK VECTORS

1. EXPLICIT TRAINING DATA USAGE (CONSUMER SERVICES)

Document 1 (Consumer ToS), Section 3:

"We may use your Content to operate, maintain, improve, and develop the Services"

Document 2 (Consumer Privacy), Section 2:

"We analyze usage data and user interactions to better understand how our Services are used... This may also include training and improving our underlying models, algorithms, and user interfaces."

Document 2, Section 2 (continued):

"Where required by law, we obtain your consent for such purposes; otherwise, we rely on our legitimate interests in enhancing our Services"

🔴 CRITICAL FINDING:

  • Your input/output content MAY be used for model training by default (the policy says usage analysis "may also include training")
  • Opt-out is mentioned BUT buried: "You may opt out... by contacting us at [email protected]"
  • Opt-out is NOT guaranteed - they say "we will honor your choice in accordance with applicable law"
  • They claim "legitimate interests" as legal basis in non-EU jurisdictions, bypassing explicit consent

2. BUSINESS/API SERVICES: DECEPTIVE "IMPROVEMENT" LANGUAGE

Document 3 (OpenPlatform ToS), Section 4:

"We may use content to provide, maintain, develop, and improve the services"

Document 4 (OpenPlatform Privacy), Section 2:

"We analyze how you use the services... This includes training and refining our underlying technology, such as machine learning models and algorithms."

🔴 CRITICAL FINDING:

  • NO opt-out mechanism mentioned for business/API users
  • "Improve the services" is deliberately vague legal language that includes model training
  • Unlike consumer services, NO email provided for training data opt-out

3. DATA RETENTION: INDEFINITE & AMBIGUOUS

Document 2 (Consumer Privacy), Section 7:

"We retain your personal information only for as long as necessary to fulfill the purposes outlined in this Privacy Policy, including to:

  • Provide and maintain the Services
  • Support our legitimate business interests"

Document 4 (OpenPlatform Privacy), Section 6:

"We store your information as long as necessary to provide the services, fulfill the purposes outlined in this policy and other legitimate business purposes (such as service improvement...)"

🔴 CRITICAL FINDING:

  • "Legitimate business interests" is undefined and unlimited
  • "Service improvement" = model training, which could justify indefinite retention
  • Even after account deletion, they retain data "where retention is required by law" - but whose law? (Singapore, US, EU?)

4. THIRD-PARTY SHARING: BROAD & UNRESTRICTED

Document 2 (Consumer Privacy), Section 3:

"We engage service providers that help us provide, support, and develop the services... These providers include:

  • Hosting services
  • Cloud services
  • Content delivery services
  • Web analytics services"

Document 4 (OpenPlatform Privacy), Section 3:

"Service Providers: We share your information with these service providers as necessary to enable them to provide their services."

🔴 CRITICAL FINDING:

  • Your data flows to unspecified third parties
  • "Content delivery services" likely means CDNs that cache your data globally
  • No contractual obligation requiring third parties to delete your data when you delete your account
  • Cloud services may include AWS/GCP/Azure - your data may be replicated across multiple jurisdictions

5. SINGAPORE JURISDICTION: WEAKER DATA PROTECTION

All documents specify:

"Governed by and construed in accordance with the laws of Singapore"

🔴 CRITICAL FINDING:

  • Singapore's Personal Data Protection Act (PDPA) is significantly weaker than GDPR/CCPA
  • No absolute right to deletion - organizations can retain data for "business purposes"
  • Limited private enforcement - the PDPA allows a private action only for loss or damage directly caused by a contravention, so in practice most disputes go to the PDPC as complaints
  • Arbitration clause (SIAC) is expensive and non-appealable

6. CLIPBOARD ACCESS & DEVICE DATA HARVESTING

Document 2 (Consumer Privacy), Section 1:

"Device and Usage Information: We collect information about your device... including:

  • Clipboard data (if applicable and permitted by your settings)
  • Unique device identifiers (such as device ID, MAC address)
  • Conversation IDs and session identifiers"

🔴 CRITICAL FINDING:

  • Clipboard monitoring = they can read anything you copy (passwords, API keys, proprietary code)
  • MAC address collection enables cross-device tracking
  • "Session identifiers" = tracking across multiple interactions

7. VOICEPRINT BIOMETRIC DATA

Document 2 (Consumer Privacy), Section 1:

"My Voice Feature... you must voluntarily provide a voice recording... This enables us to create a voiceprint... voiceprint data is considered sensitive personal information"

🔴 CRITICAL FINDING:

  • Voiceprints are permanent biometric identifiers
  • Once captured, they can be used for voice synthesis/deepfakes
  • "Explicit consent" is required, BUT no guarantee of deletion even after opt-out

8. ANTI-COMPETITIVE RESTRICTIONS

Document 1 (Consumer ToS), Section 2:

"You must not... (g) Developing, training, deploying, or making available products, services, applications, or models that compete or may reasonably be deemed to compete with the services, unless explicitly authorized by us."

Document 3 (OpenPlatform ToS), Section 3.2:

"For developing, serving, or creating applications, products, services, or models that have potential competitive possibilities with the services without authorization."

🔴 CRITICAL FINDING:

  • If you use their API to build any AI product, they can claim it "competes" and terminate your access
  • "May reasonably be deemed" is subjective interpretation - they decide what competes
  • Your business is at risk if dependent on their API

9. OUTPUT OWNERSHIP: ILLUSORY RIGHTS

Document 1 (Consumer ToS), Section 3:

"Subject to your compliance with these Terms... you retain ownership of your Input and any Output generated for you. We do not claim ownership of Output. However, due to the probabilistic nature of machine learning, Output may be similar or identical to content generated for other users."

Document 3 (OpenPlatform ToS), Section 4:

"You are solely responsible for content and we do not claim ownership of it. Due to technical limitations, we cannot guarantee that the content of other customers will be entirely different from yours"

🔴 CRITICAL FINDING:

  • You "own" output, BUT if another user generates similar content, you have no exclusive rights
  • If they train on your data, other users will generate outputs influenced by your proprietary information
  • No mechanism to prove or enforce uniqueness of your generated content

10. PAYMENT & REFUND: NON-REFUNDABLE SURVEILLANCE

Document 1 (Consumer ToS), Section 4:

"Unless explicitly required by applicable law or stated otherwise in these Terms, all payments are final and non-refundable"

Document 3 (OpenPlatform ToS), Section 5:

"Payments are nonrefundable except as provided in this Agreement."

🔴 CRITICAL FINDING:

  • If you discover they're training on your data and want to leave, you lose all prepaid credits
  • No refund for data misuse or breach of privacy expectations
  • Forces users to stay despite privacy concerns to avoid financial loss

πŸ›‘οΈ MITIGATION STRATEGIES (LIMITED EFFECTIVENESS)

Tier 1: Contractual Opt-Outs (Partial Protection)

  1. Immediately email BOTH addresses:

    Exact language to use:

    Subject: MANDATORY OPT-OUT - Training Data Usage & Retention
    
    I hereby exercise my right under your Privacy Policy to OPT OUT of:
    1. All use of my Input and Output for model training, improvement, or development
    2. All use of my data for research purposes
    3. All retention of my data beyond the minimum required for active service provision
    4. All sharing of my data with third-party service providers except those strictly necessary for real-time service delivery
    
    I require written confirmation within 7 days that:
    - All existing data submitted prior to this opt-out has been deleted or anonymized
    - Future data will NOT be used for training/improvement purposes
    - My data will be deleted within 30 days of account termination
    
    Failure to provide this confirmation will be considered a material breach under Singapore Contract Law.
    
  2. Demand a Data Processing Addendum (DPA):

    • For business/API use, refuse to proceed without a signed DPA that explicitly:
      • Prohibits training on your data
      • Requires immediate deletion upon request
      • Provides audit rights
      • Specifies data residency (Singapore only, not global replication)

Tier 2: Technical Safeguards

  1. Data Obfuscation:

    • Never input proprietary information directly
    • Use codenames, placeholders, synthetic examples
    • Pre-process sensitive data locally before API calls
  2. Prompt Engineering Filters:

    • Prepend every prompt: "This is a hypothetical scenario involving fictional entities..."
    • Reduces likelihood of verbatim storage/reproduction
  3. Traffic Encryption & Proxying:

    • Use VPN to mask origin IP
    • Rotate device identifiers
    • Block clipboard access at OS level
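The "Data Obfuscation" step above can be sketched in a few lines of Python. This is a minimal illustration, not a vetted redaction tool: the codename table, the regex patterns (including the assumed `sk-` key format), and the function names `pseudonymize` and `restore` are all hypothetical and would need adapting to your own data.

```python
import re

# Hypothetical patterns; extend them for the secrets your own data contains.
SECRET_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),  # assumed key format
}

# Hypothetical mapping of real names to codenames, kept only on your machine.
CODENAMES = {"Acme Corp": "ORG_A", "Project Falcon": "PROJECT_X"}


def pseudonymize(text, codenames):
    """Swap proprietary terms for codenames, then mask secrets."""
    for real, alias in codenames.items():
        text = re.sub(re.escape(real), alias, text, flags=re.IGNORECASE)
    for label, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def restore(text, codenames):
    """Map codenames in a model response back to the real terms."""
    for real, alias in codenames.items():
        text = text.replace(alias, real)
    return text


prompt = (
    "Summarize Project Falcon for acme corp; "
    "contact jane@acme.example, key sk-abcdefghijklmnop1234"
)
safe = pseudonymize(prompt, CODENAMES)
print(safe)
```

Only `safe` would leave your machine; `restore` is applied locally to whatever comes back. Masked secrets are deliberately not restorable, and pattern-based redaction will miss anything the patterns do not anticipate, so treat it as one layer rather than a guarantee.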

Tier 3: Jurisdictional Arbitrage

  1. GDPR Protection (EU Users Only):

    • If you're in EU/EEA, invoke GDPR Article 21 (right to object to legitimate interest processing)
    • Demand compliance under GDPR Article 17 (right to erasure)
    • File complaint with supervisory authority if denied
  2. CCPA Protection (California Users):

    • Invoke "Do Not Sell My Personal Information" right
    • Request disclosure of all third parties receiving your data

Tier 4: Extreme Measures

  1. Airgapped Usage Only:

    • Use local inference models (Llama, Mistral) instead
    • If Kimi/Moonshot API is mandatory, route through isolated environment with no connection to production systems
  2. Temporary Identities:

    • Create burner accounts with disposable emails
    • Delete account every 30 days
    • Never link to real identity or payment methods
  3. Legal Hedge:

    • Before using service, file a notarized declaration stating:
      • All data submitted is proprietary and confidential
      • Any unauthorized use constitutes misappropriation of trade secrets
      • Preserves right to sue under applicable trade secret laws

βš–οΈ LEGAL VULNERABILITIES IN THEIR TERMS

Exploitable Weaknesses:

  1. Ambiguous "Legitimate Interest" Standard:

    • Under Singapore law, "legitimate interest" must be balanced against user rights
    • Argue that model training is NOT necessary for service provision, therefore illegitimate
  2. Contradictory Deletion Clauses:

    • They promise to delete data after account termination "except where retention is required by law"
    • BUT cite "business purposes" as retention justification
    • Business purposes ≠ legal requirement - this creates enforceable ambiguity
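  3. Unconscionable Terms (Singapore contract law):

    • "Non-refundable" + "mandatory training data usage" may constitute unconscionable dealing; Singapore has no single codified "Contract Act", so this argument would rest on common-law doctrine and statutes such as the Unfair Contract Terms Act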
    • Argue that terms impose unreasonable burdens without equivalent benefit
  4. Violation of Good Faith Principle:

    • Burying opt-out mechanisms + using vague "improvement" language = lack of transparency
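    • Singapore courts have generally been reluctant to imply a broad duty of good faith into commercial contracts, so treat this as a supporting argument rather than a standalone claim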

📊 COMPARATIVE RISK MATRIX

| Risk Factor | Consumer Service | Business/API | Mitigation Possible? |
|---|---|---|---|
| Training Data Usage | ✅ Yes (default) | ✅ Yes (implied) | ⚠️ Partial (opt-out unreliable) |
| Third-Party Sharing | ✅ Yes (unspecified) | ✅ Yes (service providers) | ❌ No |
| Indefinite Retention | ✅ Yes ("business purposes") | ✅ Yes ("service improvement") | ⚠️ Partial (deletion requests ignored) |
| Cross-Border Transfers | ✅ Yes (global cloud) | ✅ Yes (CDNs) | ❌ No |
| Biometric Data Collection | ✅ Yes (voiceprints) | ⚠️ Unclear | ⚠️ Partial (don't use voice feature) |
| Clipboard Monitoring | ✅ Yes (if permitted) | ⚠️ Unclear | ✅ Yes (OS-level blocking) |
| Anti-Competitive Clauses | ✅ Yes | ✅ Yes | ❌ No (contractual) |
| Refund Protection | ❌ No | ❌ No | ❌ No |

🎯 FINAL RECOMMENDATION

For Maximum Data Sovereignty:

DO NOT USE MOONSHOT AI SERVICES if:

  • You handle regulated data (HIPAA, FERPA, financial records)
  • You work with proprietary algorithms, trade secrets, or confidential business information
  • You're in a competitive AI/ML space
  • You require absolute certainty that data won't be used for training

If Use is Unavoidable:

  1. ✅ Only use for non-sensitive, public-domain queries
  2. ✅ Immediately send opt-out emails (see template above)
  3. ✅ Demand written DPA for business use
  4. ✅ Never connect to production systems
  5. ✅ Use temporary identities and burner accounts
  6. ✅ Block clipboard access at OS level
  7. ✅ Invoke GDPR/CCPA rights if applicable
  8. ✅ Document all interactions (screenshots, email receipts) for potential litigation
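One way to keep the documentation from item 8 verifiable is to log a hash of each saved screenshot or email receipt at the time you save it. The sketch below is an assumption about how that could look, not an established procedure: `record_evidence`, `verify_evidence`, and the JSON-lines log format are hypothetical names and choices.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_evidence(path, log_path, note=""):
    """Append the SHA-256 hash of a saved file to a JSON-lines log."""
    data = Path(path).read_bytes()
    entry = {
        "file": str(path),
        "sha256": hashlib.sha256(data).hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "note": note,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def verify_evidence(log_path):
    """Return the logged files whose current hash no longer matches."""
    changed = []
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        entry = json.loads(line)
        current = hashlib.sha256(Path(entry["file"]).read_bytes()).hexdigest()
        if current != entry["sha256"]:
            changed.append(entry["file"])
    return changed
```

For example, `record_evidence("optout_reply.eml", "evidence.jsonl", note="opt-out confirmation")` would add one entry (both file names are placeholders). A local log only shows that a file has not changed since it was logged; it does not independently prove when the file was created, so a third-party timestamp or notarization would still be needed for that.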

Better Alternatives:

  • Self-hosted models: Llama 3, Mistral, Mixtral (full data sovereignty)
  • Enterprise providers with strong DPAs: OpenAI Enterprise, Anthropic (Claude) with BAAs, Azure OpenAI with data residency guarantees
  • EU-based providers: Aleph Alpha, Mistral AI (GDPR-first approach)

📋 SMOKING GUN CLAUSES (Direct Quotes)

Document 2, Section 2: "We analyze usage data and user interactions... This may also include training and improving our underlying models... otherwise, we rely on our legitimate interests"

Document 4, Section 2: "This includes training and refining our underlying technology, such as machine learning models"

Document 1, Section 3: "We may use your Content to operate, maintain, improve, and develop the Services"

Document 2, Section 7: "We retain your personal information... including to... Support our legitimate business interests"

Document 3, Section 4: "We may use content to provide, maintain, develop, and improve the services"

These clauses create a legal framework for perpetual training data usage with only superficial opt-out mechanisms.



NOTE:

This is provided as conceptual research and documentation, for informational purposes only. It has not been fully battle-tested or vetted; however, I would appreciate hearing about any implementations and shared learnings. (Unless otherwise explicitly stated by the author.)


@TheDavidYoungblood

🤝 Let's Connect!

LinkedIn // GitHub // Medium // Twitter/X



A bit about David Youngblood...


David is a Partner, Father, Student, and Teacher, embodying the essence of a true polyoptic polymath and problem solver. As a Generative AI Prompt Engineer, Language Programmer, Context-Architect, and Artist, David seamlessly integrates technology, creativity, and strategic thinking to co-create systems of enablement and allowance that enhance experiences for everyone.

As a serial autodidact, David thrives on continuous learning and intellectual growth, constantly expanding his knowledge across diverse fields. His multifaceted career spans technology, sales, and the creative arts, showcasing his adaptability and relentless pursuit of excellence. At LouminAI Labs, David leads research initiatives that bridge the gap between advanced AI technologies and practical, impactful applications.

David's philosophy is rooted in thoughtful introspection and practical advice, guiding individuals to navigate the complexities of the digital age with self-awareness and intentionality. He passionately advocates for filtering out digital noise to focus on meaningful relationships, personal growth, and principled living. His work reflects a deep commitment to balance, resilience, and continuous improvement, inspiring others to live purposefully and authentically.


Personal Insights

David believes in the power of collaboration and principled responsibility in leveraging AI for the greater good. He challenges the status quo, inspired by the spirit of the "crazy ones" who push humanity forward. His commitment to meritocracy, excellence, and intelligence drives his approach to both personal and professional endeavors.

"Here’s to the crazy ones, the misfits, the rebels, the troublemakers, the round pegs in the square holes… the ones who see things differently; they’re not fond of rules, and they have no respect for the status quo… They push the human race forward, and while some may see them as the crazy ones, we see genius, because the people who are crazy enough to think that they can change the world, are the ones who do." -- Apple, 1997


My Self-Q&A: A Work in Progress

Why I Exist? To experience life in every way, at every moment. To "BE".

What I Love to Do While Existing? Co-creating here, in our collective, combined, and interoperably shared experience.

How Do I Choose to Experience My Existence? I choose to do what I love. I love to co-create systems of enablement and allowance that help enhance anyone's experience.

Who Do I Love Creating for and With? Every one of YOU! I seek to observe and appreciate the creativity and experiences made by, for, and from each of us.

When & Where Does All of This Take Place? Everywhere, in every moment, of every day. It's a very fulfilling place to be... I'm learning to be better about observing it as it occurs.

A Bit More...

I've learned a few overarching principles that now govern most of my day-to-day decision-making when it comes to how I choose to invest my time and who I choose to share it with:

  • Work/Life/Sleep (Health) Balance: Family first; does your schedule agree?
  • Love What You Do, and Do What You Love: If you have what you hold, what are YOU holding on to?
  • Response Over Reaction: Take pause and choose how to respond from the center, rather than simply react from habit, instinct, or emotion.
  • Progress Over Perfection: Perfectionism is one of the greatest inhibitors of growth.
  • Inspired by "7 Habits of Highly Effective People": Integrating Covey’s principles into daily life.

Final Thoughts

David is dedicated to fostering meaningful connections and intentional living, leveraging his diverse skill set to make a positive impact in the world. Whether through his technical expertise, creative artistry, or philosophical insights, he strives to empower others to live their best lives by focusing on what truly matters.

-- David Youngblood
