2024-12-03

Legal risks and compliance practices facing the development of AI enterprises

Author: Li Shengnan

01. Overview of laws and regulations

02. Artificial intelligence enterprise compliance points

(1) Data collection and use

1. Collect by yourself

Many enterprises currently collect personal information illegally. Didi, for example, was fined 8.026 billion yuan for illegal or excessive collection of personal information, equivalent to 5% of its turnover in the previous year. AI companies usually collect data themselves through apps, mini programs and similar products. How should they formulate privacy policies when personal information is involved, and how should that information be collected? Judging from enterprises that have already passed compliance rectification, if an AI enterprise's products include apps, mini programs or websites, it must formulate a privacy policy and, when the product is first run, prompt users through a conspicuous means such as a pop-up to read it and click to confirm. Collection must not begin before the user consents. The information collected must comply with the principle of minimum necessity: no information unrelated to providing the service may be collected, and no information may be provided to others or otherwise processed without consent.

2. Crawler collection

Many business scenarios of AI enterprises depend on large volumes of data, and one common source is data captured from the network with crawler technology. Where, then, is the legal boundary for enterprises that rely on crawlers?

For example, Z Company is an Internet big data company that provides digital transformation services. Its chief technology officer and technicians were prosecuted for using crawler technology to illegally obtain data from a food delivery platform. They used both external crawling (using technical means to breach network security measures) and internal crawling (using account passwords and browser plug-ins in violation of user agreements). Although this data was not personal information, Chinese law also protects such commercial data. In the end, Z Company was held liable for causing a loss of more than 40,000 yuan. Similarly, Capricorn Technology, which provides risk management services for a number of financial institutions, used crawling technology to illegally obtain users' sensitive personal information, including call records and social security records, involving 20 million pieces of data. The company was fined, and its legal representative was sentenced to three years in prison and fined, for infringing citizens' personal information.

In addition to criminal risk, improper collection and use of data also carries civil risk, which mostly takes the form of unfair competition claims. The defendant, Wind Technology, captured Autonavi's "congestion delay index" data without Autonavi's permission by changing IP addresses and forging browser identifiers. Although this data was not personal information, Wind Technology stored it in its terminal software and distributed it to paying users for commercial purposes. Autonavi argued that business users who could see this data there might choose Wind Technology's products over Amap, and sued Wind Technology for unfair competition. In the end, the court upheld Autonavi's claim, ordered Wind Technology to pay Autonavi a total of 12.5 million yuan for its losses and reasonable expenses, and recognized the "congestion delay index" data as legally protected.

In another case, the defendant circumvented Weibo's protective measures by changing IP addresses, switching Weibo user accounts and using other technical means, illegally obtained Weibo's public data and sold it on its own platform, with illegal income of more than 20 million yuan. The court considered that obtaining a large amount of Weibo data through deceptive technical means and selling it to unspecified users increased the risk that the Weibo platform would be substantially substituted, could lead to data security problems such as the disclosure of personal privacy and sensitive information, and disrupted market order. The court therefore found that the defendant's conduct constituted unfair competition and ordered it to pay 20 million yuan in compensation for economic losses and reasonable costs of rights protection. Whether the data captured belongs to Weibo or to Autonavi, Chinese law protects the data generated by such platforms and recognizes its value.

When automated tools are used to crawl network data, an evaluation must be carried out first: will the crawler's behavior illegally infringe others' network rights and interests, or interfere with the normal operation of network services?

The evaluation should also establish the scope and reasonableness of the data to be crawled. Its main points are as follows:

(1) The crawled data should be public. Government data that is disclosed only conditionally must not be crawled by bypassing its permission settings. If access to such data is required, it should be requested through the prescribed procedure.

(2) Comply with the website's robots.txt protocol or other published terms, and avoid crawling data that the platform expressly prohibits. If the platform has issued a notice prohibiting crawling, stop crawling immediately and take appropriate measures in response.

(3) Follow the triple authorization principle: the user authorizes the platform, the platform authorizes the crawling party, and the user authorizes the crawling party. Together these form a complete chain that keeps the crawling legally compliant.

(4) Crawl within reasonable limits, avoid frequent crawling within a short period, and do not break or circumvent the target website's anti-crawling measures.

(5) It is recommended not to collect personal information or corporate trade secrets, because the risk of crawling such data is extremely high.

(6) Try to avoid crawling data from the platforms of direct competitors, so as not to give the other party grounds to sue under the Anti-Unfair Competition Law.
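
Points (2) and (4) above can be partly enforced in the crawler itself. The following is a minimal Python sketch, not a complete compliance control: the class name CrawlGate, the user agent "example-bot" and the two-second interval are assumptions made for illustration, and the robots.txt content is assumed to have been downloaded already. It uses the standard library's urllib.robotparser to refuse URLs that robots.txt disallows and enforces a minimum delay between requests to the same host.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class CrawlGate:
    """Hypothetical helper: decides whether a crawler may request a URL now.

    Combines a robots.txt check with a minimum delay between requests
    to the same host. It does not cover authorization or data type checks.
    """

    def __init__(self, user_agent, min_interval_seconds=2.0, clock=time.monotonic):
        self.user_agent = user_agent
        self.min_interval_seconds = min_interval_seconds
        self.clock = clock
        self._robots = {}        # host -> RobotFileParser
        self._last_request = {}  # host -> time of the last permitted request

    def load_robots(self, host, robots_txt):
        """Register robots.txt content that was already downloaded for a host."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self._robots[host] = parser

    def may_fetch(self, url):
        host = urlsplit(url).netloc
        parser = self._robots.get(host)
        # No robots.txt loaded for this host: refuse rather than guess.
        if parser is None or not parser.can_fetch(self.user_agent, url):
            return False
        now = self.clock()
        last = self._last_request.get(host)
        if last is not None and now - last < self.min_interval_seconds:
            return False  # too soon; the caller should wait and retry
        self._last_request[host] = now
        return True


gate = CrawlGate("example-bot")
gate.load_robots("example.com", "User-agent: *\nDisallow: /private/\n")
print(gate.may_fetch("https://example.com/private/data"))
```

A gate of this kind only addresses the technical side; whether the data is public, authorized, or belongs to a competitor still has to be assessed separately.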

3. Third-party access

Data can also be obtained through third parties. Third-party data may be purchased, provided free of charge, or jointly processed under an agreement. In every case, data obtained from a third party needs to be fully reviewed and evaluated to ensure that it is legal and compliant:

(1) Examine the legitimacy of third-party data sources. Require the third party to provide proof that the source of the data is legal; if the third party collected the data itself from users, ask for records of the users' consent.

(2) Review the privacy policy and user agreement displayed by the third party to the user to ensure compliance with relevant laws and regulations.

(3) Require the third party to issue a letter of commitment of legality, promising that the data provided by it meets the requirements of compliance.

(4) Sign data provision or entrusted processing agreements with third parties. The nature and content of the agreement shall be determined according to the actual situation of both parties, including the purpose, scope and security requirements of data use.

(5) Verify whether the third party has the relevant qualifications, licenses, certifications and filings.

4. Training data management

The recently issued Regulations on Network Data Security Management make clear that enterprises providing generative artificial intelligence services must strengthen the security management of training data and of the activities that process it, and take effective measures to prevent and deal with network data security risks. The regulations thereby place responsibility for the security management of training data and processing activities on AIGC-related data processors and enterprises as an express legal obligation.

In addition, Article 24 of the Regulations on Network Data Security Management provides that if an enterprise using automated collection technologies (such as crawlers) cannot avoid collecting unnecessary personal information, or collects personal information without obtaining the individual's consent in accordance with the law, it must delete or anonymize that personal information in a timely manner. In other words, whenever personal information is caught up in crawling, the company is obliged to delete or anonymize it. If deletion or anonymization is technically difficult, the company should stop all processing other than storage and take the necessary measures to protect the security of the personal information.
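
As an illustration of removing personal information from crawled text before it reaches a training set, here is a minimal Python sketch. The function name scrub_personal_info and the two patterns (mainland-China mobile numbers and e-mail addresses) are assumptions chosen for the example. Pattern-based masking of this kind catches only obvious identifiers and should not be taken as meeting the legal standard of anonymization by itself.

```python
import re

# Illustrative patterns only (assumptions): 11-digit mainland-China mobile
# numbers and e-mail addresses. Real personal information covers far more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def scrub_personal_info(text):
    """Mask e-mail addresses and mobile numbers.

    Returns (scrubbed_text, number_of_replacements). E-mails are masked
    first so that a numeric local part is not half-masked as a phone number.
    """
    text, emails = EMAIL_RE.subn("[EMAIL]", text)
    text, phones = PHONE_RE.subn("[PHONE]", text)
    return text, emails + phones


cleaned, count = scrub_personal_info("Call 13812345678 or write to zhang@example.com today")
print(cleaned, count)
```

Records in which replacements were made can also be logged and routed for deletion, which is the other option the regulation offers.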

(2) Intellectual property infringement

1. Own intellectual property layout

As they develop, artificial intelligence enterprises first need to pay attention to their own intellectual property layout. Intellectual property rights include patents, software copyrights and trademarks. Software copyrights and patents are especially important: AI enterprises are advised to focus on software copyright registration and to apply for high-quality invention patents.

Patents and software copyrights are key elements of an intellectual property layout. In investment and M&A transactions, an enterprise can use its invention patents and software copyrights as capital contributions when investing in other enterprises. A sound intellectual property layout not only raises the future value of the enterprise but is also of great significance for its long-term development.

2. Intellectual property infringement compliance advice

Artificial intelligence enterprises have been sued for intellectual property infringement both in China and abroad. A typical foreign case is the New York Times lawsuit against OpenAI. The New York Times sued OpenAI and Microsoft, accusing them of using the newspaper's copyrighted content without authorization to train AI models and of presenting that content to users in ChatGPT products. According to incomplete statistics compiled by a "Daily Economic News" reporter, as of the end of June this year at least 13 news media organizations had filed infringement lawsuits against OpenAI and Microsoft. The case is still pending.

A typical case in China is the Ultraman case. The plaintiff, Company A, holds an exclusive license from the copyright owner of the "Ultraman" series of works. The defendant, Company B, operates a website that provides AI painting services to paying members by calling a large model service provided by a third party. When a user enters a prompt involving "Ultraman", the website generates pictures bearing the Ultraman image for members to view and download. Company A asked the court to find that Company B had infringed its copyright, and to order Company B to delete the Ultraman material from its training data set and bear liability for the corresponding losses. The court ruled as follows: 1. Company B infringed the plaintiff's rights of reproduction and adaptation and must immediately stop the infringement. Stopping the infringement must reach the following standard: "When users normally use prompts related to Ultraman, the service cannot generate pictures substantially similar to the Ultraman works involved in the case." 2. The court did not support the plaintiff's request to delete the Ultraman material from the training data set, because the defendant did not actually carry out model training. 3. On the claim for damages, the court held that the defendant was at fault for failing to establish a complaint and reporting mechanism, warn of potential risks, and conspicuously label generated content, as required by the Interim Measures for the Management of Generative Artificial Intelligence Services and the Provisions on the Administration of Deep Synthesis of Internet Information Services, and awarded compensation of 10,000 yuan.

From the New York Times case against OpenAI to the Ultraman case, it can be seen that companies engaged in AI painting or other specific applications often plug into existing large models, such as Kimi or other widely used models. First, when connecting to a large model, the enterprise should properly check the legitimacy of the model and its content controls. Where necessary, it can require the supplier to provide relevant explanations and letters of commitment so that its front-end operations remain legally compliant. Second, the enterprise should fully inform users of the precautions that apply when using the model, in particular the possible risk of content infringement and its legal consequences. Enterprises should also establish an emergency response mechanism and provide users with feedback and complaint channels so that problems users encounter can be handled promptly.

3. Intellectual property risks of open source software

The use of open source software in the course of business also involves intellectual property risk. First, if a company uses software that carries no open source license, that use may infringe the rights holder's copyright. Without a license, users may only browse the code, not use it; using it freely without permission may violate copyright. Second, even if the software has an open source license, there is risk if the enterprise does not use the software as the license requires. For example, a company may incorporate software released under the GPL license when developing its own software. The GPL requires that derivative works based on GPL-licensed code also be open source. Some enterprises nevertheless choose to commercialize the resulting software as closed source after their own further development, which is inconsistent with the GPL and likewise carries a risk of intellectual property infringement.
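
The two risks described above (no license at all, and a GPL license combined with closed-source distribution) can be screened for during dependency review. The following Python sketch is illustrative only: the Dependency type, the review_dependency function and the short list of SPDX-style GPL identifiers are assumptions for the example, and the list is deliberately incomplete.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately incomplete set of GPL identifiers (SPDX style).
COPYLEFT_LICENSES = {
    "GPL-2.0-only", "GPL-2.0-or-later",
    "GPL-3.0-only", "GPL-3.0-or-later",
}


@dataclass
class Dependency:
    name: str
    license_id: Optional[str]  # None means no license could be found


def review_dependency(dep, product_is_open_source):
    """Return a coarse review result for one third-party component."""
    if dep.license_id is None:
        # No license: the code may be read but there is no permission to use it.
        return "block: no license, so no permission to use"
    if dep.license_id in COPYLEFT_LICENSES and not product_is_open_source:
        # GPL code inside a closed-source product conflicts with the license.
        return "review: GPL obligations conflict with closed-source distribution"
    return "ok: still confirm the license terms are followed"


print(review_dependency(Dependency("libexample", "GPL-3.0-only"), product_is_open_source=False))
```

A result of "ok" here only means that neither of the two risks above was triggered; every license still carries its own conditions.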

(3) Qualification filing

Different AI companies require different qualifications depending on their business model and specific circumstances.

Common qualifications include ICP filing and ICP licensing. Which one applies depends on whether the company's website, products or services are operational (commercial) or non-operational. In either case the server must be deployed in China; ICP filing and licensing cannot be carried out for overseas servers.

In addition, if the enterprise's product or service is a network information service, a public security network filing must be completed. Some websites or apps may prefer not to be deployed on domestic servers and choose to be deployed abroad. If an AI enterprise provides blockchain services, it also needs to complete the domestic blockchain information service filing. There are also the algorithm filing and the large model filing, which attract a great deal of attention. At present, Shanghai encourages enterprises to complete algorithm and large model filings and offers certain rewards as encouragement and support. The qualifications and licenses described above are also where future supervision will focus.

(4) Trade secrets

1. Case sharing

Trade secret protection is an issue to which all kinds of enterprises, including artificial intelligence enterprises, attach great importance. For AI enterprises, some data is especially sensitive because of the nature of the business. For example, a high-tech enterprise focused on the research and development of visual artificial intelligence processor chips stored a large amount of confidential data, such as core code, in its computer room; this data constituted the company's core competitiveness and trade secrets. The company later found an unauthorized computer in the computer room, which was verified to belong to one of the company's founders. The founder had uploaded the company's core data from the computer room to a personal computer. The investigation found that the founder was involved, in another capacity, in another company's merger and acquisition activities, and intended to use the stolen trade secrets to raise money at the new company after leaving the current one. This conduct clearly violated the law on trade secret protection. The founder was eventually prosecuted for the crime of infringing trade secrets and sentenced to two years in prison and a fine of 100,000 yuan.

Besides criminal protection, trade secrets are also protected through civil litigation. In one case, a departing employee took the company's customer list and used the information to sell products to those customers. A customer reported this to the original company, and the court later determined that the customer list was a trade secret because it contained special information, such as non-public contact details, that could give the company a competitive advantage. The court therefore ordered the employee to compensate the company 80,000 yuan for its losses.

2. Compliance suggestions

(1) Establish a sound internal management system: enterprises should formulate strict confidentiality rules, set up confidentiality areas, and implement decentralized control strategies to ensure the security of confidential information.

(2) Strengthen personnel management: enterprises should sign confidentiality agreements with employees, and sign non-competition agreements with key technical personnel. At the same time, the management of departing employees should be strengthened to prevent them from leaking the company's trade secrets.

(3) Use technical means to protect trade secrets: enterprises should encrypt trade secrets and use other technical measures to prevent unauthorized access and disclosure.

(4) Improve legal awareness: Enterprises should regularly conduct legal awareness training for employees to ensure that they understand the importance of trade secret protection and relevant legal provisions.

(5) Avoid the threat of generative AI: With the development of AI technology, enterprises need to be alert to the risks that generative AI may bring. Employees may inadvertently input the company's trade secrets into large AI models such as ChatGPT, Kimi, etc., and these models may use the input data for learning and future output, thereby revealing the company's trade secrets. Therefore, companies should advise employees to avoid entering sensitive information into external AI models.

(6) Use of internal AI models: To prevent trade secret disclosure, enterprises should develop their own large-scale AI models and encourage employees to work with these internal models to reduce reliance on external AI models, thereby reducing the risk of data breaches.

(5) Cybersecurity and ethical review of science and technology

1. Network security obligations

Article 22 of the Cybersecurity Law of the People's Republic of China provides that network products and services must meet the mandatory requirements of the relevant national standards. Providers of network products and services must not install malicious programs. When a provider finds that its network products or services have security defects, vulnerabilities or other risks, it must immediately take remedial measures, inform users in a timely manner and report to the relevant competent authorities in accordance with regulations. Providers must continue to provide security maintenance for their products and services, and must not terminate it within the period prescribed by law or agreed by the parties. Where network products or services collect user information, the provider must clearly inform users and obtain their consent; where users' personal information is involved, the provisions of this Law and of relevant laws and administrative regulations on the protection of personal information must also be observed. In addition, Article 25 requires network operators to formulate emergency plans for network security incidents and to deal promptly with security risks such as system vulnerabilities, computer viruses, network attacks and network intrusions. When an incident endangering network security occurs, the operator must immediately activate the emergency plan, take the corresponding remedial measures, and report to the relevant competent authorities in accordance with regulations. These are mandatory obligations for companies and part of their cybersecurity responsibilities.

2. Obligation of scientific and technological ethical review

According to Article 2 of the Measures for Science and Technology Ethics Review (Trial), implemented on December 1, 2023, scientific and technological activities involving human research participants, including those that use personal information data, are subject to science and technology ethics review. Institutions of higher learning, scientific research institutions, medical and health institutions, enterprises and similar entities are the responsible subjects for the management of science and technology ethics review. Units engaged in scientific and technological activities such as artificial intelligence, whose research touches on ethically sensitive fields, must establish a science and technology ethics (review) committee.

(6) Enterprises going overseas and outbound data transfers

When companies deploy business overseas, outbound data transfer is a topic that cannot be ignored. There are two types of outbound transfer: data collected in China is transmitted abroad directly; or the data is stored in China, but overseas organizations and individuals can query, access, download or export it. AI companies often overlook the second situation: although the data is stored in the country, an overseas entity, branch or parent company can access and download it, which also counts as an outbound transfer. Common outbound scenarios include storing data directly overseas, storing it in China first and then synchronizing it to overseas systems, and sharing it with Hong Kong, Macao and Taiwan.

For the specific regulatory path for outbound data transfers, refer to the latest regulations and the accompanying chart.

The chart sets out the different obligations that apply to different data types and situations. If the data is identified as important data, an outbound data transfer security assessment must be carried out. Data that is neither personal information nor important data can be transferred abroad directly.

For personal information, if an exemption applies, for example where the transfer is necessary for performing a contract, for human resources management, or in an emergency, the data can be transferred abroad directly. Likewise, where the amount of personal information involved is very small (fewer than 100,000 individuals), the transfer falls outside the scope of supervision. Operators of critical information infrastructure are also required to carry out an outbound data transfer security assessment. For personal information related to Hong Kong and Macao, a standard contract must be filed; for other regions, the regulatory path depends on the type and volume of data.
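
The decision path described above can be written down as a short function. The following Python sketch encodes only the rules stated in this section; the Transfer type, its field names, the returned labels and the order in which the checks run are assumptions made for illustration, and any case the text does not settle is sent back to the regulatory chart rather than decided in code.

```python
from dataclasses import dataclass

SMALL_VOLUME_THRESHOLD = 100_000  # figure quoted in the text above


@dataclass
class Transfer:
    is_important_data: bool
    contains_personal_info: bool
    exemption_applies: bool    # e.g. contract performance, HR management, emergency
    exporter_is_ciio: bool     # critical information infrastructure operator
    individuals_affected: int


def outbound_path(t):
    """Map a planned transfer to the regulatory path named in the text."""
    if t.is_important_data:
        return "security assessment"
    if not t.contains_personal_info:
        return "free to transfer"
    if t.exemption_applies:
        return "free to transfer"
    if t.exporter_is_ciio:
        return "security assessment"
    if t.individuals_affected < SMALL_VOLUME_THRESHOLD:
        return "free to transfer"
    # Larger volumes: the path depends on data type and volume.
    return "consult the regulatory chart"


print(outbound_path(Transfer(False, True, False, False, 5_000)))
```

Such a function is best treated as a first-pass triage for routing cases to legal review, not as a substitute for that review.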

In addition to outbound data transfers, AI enterprises expanding overseas need to pay attention to the regulations and restrictions that apply to going abroad. Export restrictions arise in the context of protecting core technology from leaking, as when the Trump administration demanded that ByteDance sell TikTok or face a ban. In the same month, China's Ministry of Commerce and Ministry of Science and Technology adjusted and issued the Catalogue of Technologies Prohibited or Restricted from Export, which lists "personalized information push service technology based on data analysis" as a technology restricted from export. This means that the core algorithm technology TikTok relies on falls under Chinese export controls, and that transferring it to other countries requires approval from the relevant departments of the Chinese government.

In addition, US restrictions on cloud computing also affect AI companies. Many companies train large AI models on the services of US cloud computing vendors, such as Amazon AWS. In January 2024, however, the U.S. Department of Commerce released for public comment proposed rules on customer identification for IaaS cloud services. The rules would require US IaaS providers to implement customer identity verification procedures and, when the relevant conditions are met, to report foreign customers' detailed identity information and large-model training activities to the Department of Commerce, thereby restricting foreign customers, especially Chinese customers, from using US cloud services to train large AI models. Once the rules take effect, they will affect Chinese enterprises' access to computing power for AI training.

(7) Data asset enablement

Data capitalization means turning data resources into assets with economic value and then managing and operating them. AI enterprises hold large amounts of data and have strong data processing capabilities, so they should pay close attention to the specific paths for capitalizing data, including but not limited to recording data resources on the balance sheet and listing data products on data exchanges. Recording data resources on the balance sheet can improve an enterprise's credit and financing capacity, increase the income from its data assets, and strengthen its capacity for data innovation. Enterprises can also list their big data products on a data exchange for on-exchange trading, which enables data to circulate through transactions and can support further financing.
