Should I Learn Computer Programming in the AI Era?

Generated by Bing Image Creator.

In the previous section, I illustrated how ChatGPT affects our workflow and amplifies our writing capabilities.

Lately, the most frequent question I receive from students is whether it is still necessary to learn computer programming, given the impressive ability of AI systems to code even complex tasks. The answer is “no” if the scope of programming you require aligns with what AI systems can already handle proficiently; in that case, acquiring in-depth programming skills may not be your top priority. Conversely, the answer is a resounding “yes” if you aspire to craft programs that go beyond the capabilities of AI, as the true value of learning programming lies in the ability to create customized solutions that AI systems cannot readily generate on their own. The question becomes particularly intricate when we consider the current and future levels of programming that AI can achieve.

In July 2023, OpenAI unveiled a ChatGPT plugin named “Code Interpreter,” which can generate code and the corresponding outputs for general tasks. It is particularly remarkable in data manipulation and analytics, even outperforming an experienced programmer like myself on tasks such as data visualization that I do not perform often. Today, a growing number of students who are not pursuing a Computer Science major are keen on acquiring foundational programming skills. Part of their motivation is the aspiration to become data analysts, a career path that has garnered substantial attention due to its attractive average base salary (Indeed, Glassdoor). Possessing even basic programming skills gives these students a significant advantage over those who lack them.

Having taught many of these students, I have come to realize that, for most of them, advancing programming skills beyond the basic level may not be necessary for successful career development. Unlike Computer Science students, whose primary focus is to advance computer programs, their primary objective is to analyze data, and computer programming is simply an effective means to that end. If alternative tools, such as AI systems, can do the job more efficiently, learning programming might no longer be essential for excelling as a data analyst. If you are a student considering a career in data analysis but do not have a strong inclination toward programming, I strongly recommend delving deeper into the fundamentals of data analytics (e.g., statistical concepts, descriptive analytics, ethics) and learning how to leverage AI systems to apply these concepts. This approach can be more beneficial than enrolling in a basic programming language course.

Most advanced AI systems today rely on deep learning technology, which demands a substantial amount of data for effective training. If you aim to develop a proficient deep learning model for computer programming, you must train it on extensive datasets that pair task descriptions (inputs) with corresponding purposeful code (outputs). Fortunately, a wealth of code examples is readily available on the internet (e.g., textbooks, programming competitions, discussion forums), making them valuable sources of training data. To what extent, then, will these AI systems excel after being trained on all this data?
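To make the idea of paired inputs and outputs concrete, here is a minimal Python sketch of what such training examples might look like. The `TrainingExample` class, its field names, and the two sample records are hypothetical illustrations, not the format of any actual AI system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingExample:
    """One hypothetical training pair for a code-generation model."""
    task_description: str  # the input shown to the model
    code: str              # the output the model should learn to produce
    source: str            # where the pair came from (illustrative label)


# Two made-up examples, one per kind of source mentioned in the text.
examples = [
    TrainingExample(
        task_description="Return the sum of the even numbers in a list of integers.",
        code="def sum_even(nums):\n    return sum(n for n in nums if n % 2 == 0)\n",
        source="textbook",
    ),
    TrainingExample(
        task_description="Count how many times each word appears in a string.",
        code=(
            "def word_counts(text):\n"
            "    counts = {}\n"
            "    for word in text.split():\n"
            "        counts[word] = counts.get(word, 0) + 1\n"
            "    return counts\n"
        ),
        source="discussion_forum",
    ),
]


def to_input_and_output(example: TrainingExample) -> tuple[str, str]:
    """Split an example into the (input, output) pair used for training."""
    return example.task_description, example.code
```

Each record simply ties a description to code that fulfills it; a real dataset would contain a very large number of such pairs drawn from many sources.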

Deep learning models, especially those used for generative AI, tend to reflect the average quality of the data on which they are trained. This is why major tech companies invest significant time and resources in curating high-quality training data to produce more robust models. In essence, if a model is trained on diverse data from the internet, it will perform at roughly the level of the average programmer represented in that data pool. To ensure an AI system consistently generates high-quality code, you must raise the overall quality of your training data. This can be achieved by training exclusively on credible resources such as textbooks or official solutions from programming competitions. However, these resources do not cover the full spectrum of tasks required for real-world applications. Therefore, to broaden the model’s task coverage, you must incorporate code samples from various sources, including discussion forums that address practical problems people encounter. As you integrate this diverse data, your model will become capable of handling a wider range of tasks. However, the overall quality of the code generated by a model trained on this combined data will inevitably decrease.
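The trade-off between coverage and quality can be illustrated with a little arithmetic. The Python sketch below is a toy model only: the `Source` class, the quality scores on a 0 to 1 scale, the sample counts, and the task names are all invented for illustration, and it assumes that the quality of a combined dataset is simply the sample-weighted average of its sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A hypothetical pool of code samples."""
    name: str
    num_samples: int
    avg_quality: float  # invented score between 0 and 1
    tasks: frozenset    # kinds of tasks the samples cover


def mixture_quality(sources):
    """Sample-weighted average quality of the combined data."""
    total = sum(s.num_samples for s in sources)
    return sum(s.num_samples * s.avg_quality for s in sources) / total


def task_coverage(sources):
    """Union of the task types covered by all sources."""
    covered = set()
    for s in sources:
        covered |= s.tasks
    return covered


# Invented numbers: small curated pools, one large noisy pool.
curated = [
    Source("textbooks", 1000, 0.9, frozenset({"sorting", "searching"})),
    Source("competition_solutions", 1000, 0.9, frozenset({"searching", "graphs"})),
]
forums = Source(
    "discussion_forums", 8000, 0.5,
    frozenset({"graphs", "web_scraping", "data_cleaning", "plotting"}),
)
mixed = curated + [forums]

print("curated:", round(mixture_quality(curated), 2), len(task_coverage(curated)), "tasks")
print("mixed:  ", round(mixture_quality(mixed), 2), len(task_coverage(mixed)), "tasks")
```

With these made-up numbers, adding the forum data doubles the number of task types covered (from 3 to 6) while pulling the average quality down from 0.9 to about 0.58, which is the pattern described above.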

Generating high-quality data for a wide range of real-world application tasks is an exceedingly expensive endeavor. Even for tech giants like Google and Microsoft, building such a comprehensive dataset would likely take a decade or more; otherwise, they would have already done so. Consequently, it will still be some time before AI systems can autonomously generate programs at the level of the top 20% of programmers. Therefore, if your ambition in the field of programming is to be among the top 20%, you should definitely pursue that goal. Your skills will be in even higher demand in the future, as tech companies will eagerly seek individuals like you to produce high-quality code for training AI models. However, when it comes to tasks within the capacity of average programmers (i.e., tasks that most programmers can handle), AI systems are poised to match human capabilities in the foreseeable future. Hence, if your goal as a programmer is to work on tasks in this category, it may be prudent to reconsider, as the job may not remain as rewarding as it is today.

Dr. Jinho Choi