Creating a Competitive Research and Development Lab for AI and Machine Learning
Phillip Durst, 2022
The competitive edge of AI
Artificial intelligence and machine learning have the potential to take today’s businesses to the next level. Implemented correctly, AI has the capability to increase productivity, save time, and provide key insights into new revenue streams. The capabilities AI can bring to a company include:
AI can automate tedious, repetitive tasks, giving employees more time to focus on new, challenging problems.
AI can quickly and accurately analyze huge amounts of information and make inferences, such as forecasting financial reports.
AI can boost efficiency by automating error-prone tasks, like data scrubbing.
AI can analyze new markets and guide R&D towards new products and revenue streams.
Because of the potential benefits, AI and machine learning are quickly becoming an integral part of modern enterprises. Up to 37% of major companies report using AI in some form for their business. Unfortunately, growing an in-house AI capability isn't so simple; it is a high-risk, high-reward proposition. To build a competitive and productive AI research lab, a company needs a solid process in place. This article presents general guidance for preparing such a lab.
The National Artificial Intelligence Research and Development Plan
AI has become so ingrained in the future of research and development across global industries that the U.S. Government launched a national initiative for its development in 2019. The goal of this initiative is to identify and prioritize AI R&D efforts across government, industry, and academia. The plan lays out 7 strategies for advancing AI R&D, and this article shows how to use those 7 strategies as a blueprint for building an R&D lab.
Strategy 1: Make Long-Term Investments in AI Research
During early development, the goal should be to make smart, long-term investments. Establishing a long-term plan up front is critical to success.
First, a business needs a vision for AI in the company. Leadership must understand what AI can and cannot do. This is best accomplished by working with both experts in the field and the business’ current analytics team. Early questions need to be addressed. Businesses need to know what IT departments are involved in AI research, whether they need to stand up new departments, and what the lines of collaboration between departments look like.
Second, smart long-term investments must be made in the right people. Despite its popularity, AI is still a burgeoning field of research, and few experts exist. According to the latest reports, there are only roughly 22,000 expert PhDs in AI and machine learning in the world. This makes staffing the biggest challenge a business will face in developing an AI research capability.
AI is a complex and nuanced field. There are many tips and tricks and best practices that can only be learned from experience. Relying on young and inexperienced researchers will hinder a business’s progress, slowing down production and leading to costly errors.
Bringing on experienced talent comes with specific challenges. One factor that can hinder staffing is the academic nature of AI. AI and machine learning engineers and scientists are passionate about contributing to a major new field of research and will want to share their research with the community. A balance must be found between the proprietary nature of industry and the open-source nature of basic research. Finding this balance is an important part of establishing the long-term vision for the company.
Some tips for developing a talented AI research staff include:
Subscribing to the leading journals in AI, such as Foundations and Trends in Machine Learning and the relevant IEEE transactions.
Making sure to balance staff between early-career engineers and scientists and experienced experts in machine learning and AI.
Creating a research environment that appeals to workers from an academic background and balances applied and basic research.
Giving engineers and scientists room to disseminate their research by attending conferences and workshops and to participate in the broader AI community.
Strategy 2: Develop Effective Methods for Human-AI Collaboration
AI can be used to automate many business tasks, such as customer support and language translation. However, businesses that replace employees with AI experience short-term gains but long-term losses in productivity. AI cannot exist in a vacuum and be left to make important decisions on its own.
Researchers need effective methods for working with AI algorithms. AI should not replace humans, but rather should enable them to work more effectively. Human-AI collaboration, or collaborative intelligence, must be nurtured.
AI struggles with tasks that require intuition or complex inferences from novel data. Human input is critical to the success of AI. Effective human-AI collaboration takes place in three steps.
1. Training
We live in an environment of information overload. The sheer volume of data available is in itself a serious challenge to applying AI and machine learning. The data available often can't be trusted to be unbiased and/or accurate. AI needs a human to provide insights into the data it is analyzing.
For any given application, a wealth of different kinds of data will be available, but not all of it will be relevant. Training data sources must be carefully selected with a human in the loop, and that training data requires context and metadata, both of which a human provides.
Moreover, the number of machine learning algorithms and methods can prove daunting. There are countless ways to use machine learning, and because of the non-deterministic nature of AI, one or two mistakes up front can cause wildly inaccurate results. A human must guide the training of an AI algorithm to prevent these mistakes, for example by reviewing the predictions the model is least confident about, as sketched below.
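The minimal sketch below illustrates one such human-in-the-loop pattern: the model flags the cases it is least sure about and sends them to a person instead of acting on them automatically. It uses scikit-learn, and the classifier, the 0.8 confidence threshold, and the toy data are assumptions made purely for the example.

```python
# Minimal human-in-the-loop sketch: instead of trusting every model output,
# route the predictions the model is least confident about to a human reviewer.
# The classifier, the 0.8 threshold, and the toy data are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def split_by_confidence(model, X, threshold=0.8):
    """Return indices the model is confident about and indices needing human review."""
    proba = model.predict_proba(X)       # class probabilities per sample
    confidence = proba.max(axis=1)       # confidence of the top prediction
    confident = np.where(confidence >= threshold)[0]
    needs_review = np.where(confidence < threshold)[0]
    return confident, needs_review

# Toy usage: fit on random data, then flag uncertain cases for a human.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
y_train = rng.integers(0, 2, size=200)
X_new = rng.normal(size=(50, 5))

model = LogisticRegression().fit(X_train, y_train)
confident_idx, review_idx = split_by_confidence(model, X_new)
print(f"{len(review_idx)} of {len(X_new)} samples flagged for human review")
```

The flagged samples can then be labeled or corrected by a person and fed back into the training set, which is exactly the kind of guidance this step calls for.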
2. Explaining
To non-experts, AI looks like a “black box.” Data goes in, magic happens, and results come out. These results are often used for critical decisions, like planning financial investments or diagnosing diseases. Leadership and decision makers will need a human expert to explain the algorithm outputs. This task goes back to making long-term investments: the right experts are needed to translate AI outputs.
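One concrete technique an expert might use to open the box is permutation feature importance, which measures how much a model's performance drops when each input is shuffled. The sketch below is only an illustration using scikit-learn on synthetic data; the model and features are stand-ins, not part of any particular workflow.

```python
# Illustrative explanation aid: permutation importance estimates how much each
# input feature contributes to the model's performance, giving an expert
# concrete numbers to report to decision makers. Data and model are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much the validation score drops.
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature_{i}: importance = {importance:.3f}")
```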
3. Sustaining
Again, thinking long term, human-AI collaboration will be needed to foster and improve a company's AI R&D. AI brings a host of new challenges, and collaborative intelligence will be needed to solve them. Human experts are needed to track the state of the art and keep the company's capabilities at the forefront of the field. Sustained collaborative intelligence keeps a business competitive.
Strategy 3: Understand and Address the Ethical, Legal, and Societal Implications of AI
One of the new challenges AI brings is a host of safety concerns: physical, ethical, and legal. If the AI fails, who is responsible?
For example, if a driverless car causes an accident, who is at fault: the driver, the car manufacturer, or the programmer? Or consider automated construction equipment that fails and injures or kills workers. Businesses must clearly define the capabilities and limitations of their AI and institute policies to both reduce physical danger and establish liability for accidents.
Another example is AI that handles stored user data. Giving AI access to personal information raises ethical concerns. AI can be a powerful tool for monitoring trends among customers and can use those trends to target ads and suggest connections. Is this ethical, or is it a violation of customer privacy? Can the AI be trusted to make safe decisions with this data?
AI also raises legal concerns. Who is legally responsible if AI makes an illegal decision? Or, if AI data is used to take actions that break the law, is the AI or the user at fault? Companies must account for these concerns as well. A clear plan for what an AI can do ethically and legally must be established while making long-term investments in AI.
A good guide to ethics in AI is the “Understanding artificial intelligence ethics and safety” report published by the Alan Turing Institute. This document can be found at: https://www.turing.ac.uk/sites/default/files/2019-06/understanding_artificial_intelligence_ethics_and_safety.pdf
Strategy 4: Ensure the Safety and Security of AI Systems
AI systems must also be kept secure. AI training data often contains sensitive information that must be protected. AI systems can also be fed malicious data to produce corrupt results. A minimum degree of IT security must be guaranteed for AI to be successful. AI security can be achieved through:
Careful storage of data. Keep physical data storage in secure areas, and ensure cloud data is protected from unauthorized users.
Monitoring training data. Keeping AI secure requires keeping training data valid and accurate; a minimal sketch of such checks follows this list.
Monitoring of algorithms. Human-AI collaboration can provide checks on the validity of algorithm output.
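As one illustration of the monitoring-training-data item above, the sketch below shows two basic checks a lab might run before data reaches a training pipeline: an integrity check that the file matches an approved version, and simple validity rules on the records. The file path, column names, value ranges, and hashing scheme are hypothetical placeholders, not a prescribed standard.

```python
# Minimal training-data checks (illustrative): verify the file matches the
# version that was reviewed, then apply basic validity rules before training.
# The expected hash, column names, and ranges below are placeholders.
import hashlib
import pandas as pd

def file_sha256(path):
    """Hash the raw file so tampering or silent edits can be detected."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def validate_training_data(path, expected_sha256):
    # Integrity: the data on disk must match the version that was approved.
    if file_sha256(path) != expected_sha256:
        raise ValueError("Training data has changed since it was approved")

    df = pd.read_csv(path)

    # Validity: reject obviously bad records instead of silently training on them.
    if df["label"].isna().any():
        raise ValueError("Missing labels found in training data")
    if not df["age"].between(0, 120).all():
        raise ValueError("Out-of-range values in the 'age' column")
    return df

# Hypothetical usage:
# df = validate_training_data("train.csv", expected_sha256="<approved hash>")
```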
Strategy 5: Develop Shared Public Datasets and Environments for AI Training and Testing
Continuing the topic of training data, training datasets must be maintained and, where appropriate, shared. Training data is time-consuming and expensive to collect and label. Sharing these datasets benefits the broader AI community.
A business must find a balance between internal and external data. Sharing data may seem counterintuitive, but adding to the public knowledge of AI will ultimately pay off: advancements made by the broader community can be rolled back into the company's own labs and R&D.
Using public datasets also saves a company money. Creating training data is expensive, and public data reduces the time and cost of AI training. Moreover, public datasets make it easy to verify AI algorithms, because they provide a common baseline for measuring AI performance, as sketched below.
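The sketch below scores a trivial baseline and a candidate model on one of scikit-learn's bundled public datasets, so anyone can rerun the same comparison. The dataset and models are arbitrary choices made for illustration only.

```python
# Using a public dataset as a shared baseline (illustrative): anyone can rerun
# this and compare their own model's score against the same reference numbers.
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Trivial baseline: always predict the majority class.
baseline = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5)
# Candidate model benchmarked against that baseline on the same public data.
candidate = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=5)

print(f"baseline accuracy:  {baseline.mean():.3f}")
print(f"candidate accuracy: {candidate.mean():.3f}")
```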
Some other benefits of helping develop public datasets and environments include:
Sharing / using shared data and environments helps retain research staff. Working with this data gives them options for publication and public dissemination of research.
Sharing / using shared data and environments lets staff leverage open-source AI solutions as the backbone of their research.
Sharing / using shared data and environments saves time and money on costly data collection and labeling.
Strategy 6: Measure and Evaluate AI Technologies Through Standards and Benchmarks
AI performance is uniquely hard to benchmark. Progress is measured by comparing a new algorithm's performance against the baseline performance of extant algorithms. The most commonly used metrics, with a toy computation sketched after this list, are:
Precision: The ability of a model to identify and output only the relevant data.
Recall: The ability of a model to find all the relevant data in a dataset and use it to create outputs.
Normalized Discounted Cumulative Gain (NDCG): A comparison between the baseline (human-judged) outputs and the algorithm's outputs.
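The sketch below computes these three metrics with scikit-learn; the labels, predictions, and relevance judgments are invented purely for illustration.

```python
# Toy computation of the three metrics above (illustrative data only).
from sklearn.metrics import ndcg_score, precision_score, recall_score

# Precision and recall for a binary classification task.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # ground-truth relevance
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # model predictions
print("precision:", precision_score(y_true, y_pred))  # relevant among what was returned
print("recall:   ", recall_score(y_true, y_pred))     # returned among what was relevant

# NDCG for a ranking task: human-judged relevance vs. the model's ranking scores.
true_relevance = [[3, 2, 3, 0, 1, 2]]               # graded human judgments
model_scores = [[0.9, 0.8, 0.4, 0.3, 0.2, 0.1]]     # model's scores for the same items
print("NDCG:     ", ndcg_score(true_relevance, model_scores))
```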
However, many issues plague benchmarking for AI. Because of the novelty of the field, there often aren't enough published results to perform rigorous benchmarking, and weak baseline datasets are common. This is where using public sources becomes so important to creating an AI R&D lab. Sustained collaborative intelligence in a company's lab lets engineers choose which AI algorithms are appropriate and accurate for the company's given application.
That’s not to say standards and benchmarks for AI aren’t being aggressively pursued. Much of this work is academic, and many good papers can be found on the topic. For a business to succeed at AI, it must carefully choose how to benchmark its AI. Again, this will require the right people who can wade through the academic research and find the right tests for their company.
Strategy 7: Better Understand the National AI R&D Workforce Needs
Understanding the national AI R&D workforce is a big part of long-term planning. Early investments must be made to onboard the best AI researchers. As discussed in Strategy 1, this workforce tends to be academic in nature. They need freedom to interact with the national R&D community.
The workforce also needs freedom to explore basic research. Pursuing basic research allows a company to sustain its AI capabilities and makes the business an early adopter of new AI R&D methods. While building an AI capability, a business needs to find a balance between academic and applied engineers.
When struggling to find experts, it gets tempting to grab fresh researchers coming out of school. However, as noted in Strategy 1, AI's best practices can only be learned from experience. A company must make sure to court experienced personnel in the national workforce while also searching for and growing young talent.
Summary
AI and machine learning are dramatically changing the way businesses operate. They have the potential to be powerful tools that raise a company's bottom line or dangerous tools that give misleading or inaccurate results. Fortunately, with a few key up-front decisions, the likelihood of AI positively impacting a business goes up dramatically. This article presented best practices for developing an AI R&D capability according to the seven strategies laid out in The National Artificial Intelligence Research and Development Plan. In general, the steps to building an AI R&D lab are:
Establish a long-term vision.
Hire the right workforce.
Foster human-AI collaboration.
Ensure safety and security of AI algorithms and data.
Manage legal and ethical considerations.
Leverage public data and environments.
Benchmark AI algorithm performance for sustainment and improvement.
Following these strategies will help a business quickly, effectively, and safely create an AI R&D capability.