Currently, the system uses only open data sources, and it has already helped achieve impressive results. Its potential, however, is far from exhausted. Closed data sets, such as information about company founders, invoices, and family ties, can be used to identify many more risks.
In 2023, the plan is to move the system from Datanomix servers to the secure network of the General Prosecutor's Office. Because the General Prosecutor's Office has the authority to access various government databases, more data can be loaded into the system, and that data will be protected on servers that are not connected to the Internet. With this additional data, the system will be able to better identify affiliation between customer employees and suppliers, score suppliers on signs of fictitious business activity in order to flag risky ones, and so on.
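As a rough illustration of the two checks mentioned above, the following is a minimal sketch and not the system's actual implementation. It assumes that affiliation is detected by matching supplier founders and their relatives against customer employees, and that scoring is a weighted sum of observed signs. The sign names, the weights, and the `Supplier` record are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical signs of fictitious business activity and their weights.
# Both the names and the values are assumptions made for illustration.
SIGN_WEIGHTS = {
    "no_employees": 0.4,
    "mass_registration_address": 0.3,
    "registered_shortly_before_tender": 0.3,
}


@dataclass(frozen=True)
class Supplier:
    """Hypothetical supplier record assembled from closed data sets."""
    tax_id: str
    founders: frozenset[str]
    signs: frozenset[str]


def fictitious_activity_score(supplier: Supplier) -> float:
    """Sum the weights of the observed signs; unknown signs count as zero."""
    total = sum(SIGN_WEIGHTS.get(sign, 0.0) for sign in supplier.signs)
    return round(total, 2)


def affiliated_people(
    customer_employees: set[str],
    supplier: Supplier,
    family_ties: set[tuple[str, str]],
) -> set[str]:
    """Return customer employees who are founders of the supplier
    or direct relatives of a founder."""
    related = set(supplier.founders)
    for person_a, person_b in family_ties:
        # Family ties are treated as symmetric pairs.
        if person_a in supplier.founders:
            related.add(person_b)
        if person_b in supplier.founders:
            related.add(person_a)
    return related & customer_employees
```

A supplier whose score exceeds some agreed threshold, or for which `affiliated_people` returns a non-empty set, could then be queued for manual review. Real data would need identifiers more reliable than names, such as personal or tax identification numbers.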
Another area of development is working with unstructured data. For many risks, it is necessary to analyze unstructured text from technical specifications. The main way to restrict competition in government procurement remains tailoring the requirements to a particular supplier. Large Language Models (comparable to GPT-4, trained for this specific task on data from the government procurement portal) will be used to identify such intentional obstacles to fair competition. This will make it possible to analyze the text of any procurement documentation, regardless of how it is worded.