Large Language Models (LLMs), such as the ones behind ChatGPT, are advanced models designed to process and generate natural language. They are trained on large-scale datasets to understand and predict text. By learning complex dependencies between words across long texts, they capture relationships in language and generate context-aware text based on the patterns they have learned. This makes LLMs well suited for tasks like translation, summarization, and content generation.
Optimization, on the other hand, presents a different kind of challenge: finding the best solution to a problem within given constraints by minimizing or maximizing an objective function. For example, consider packing a backpack with various items, each assigned a specific value. The objective is to maximize the total value of the packed items while respecting the backpack's 40-liter capacity limit. This scenario illustrates an optimization problem: achieve the highest overall value while adhering to the space constraint. Countless other examples exist; in fact, an optimization problem arises whenever a decision requires selecting the best option from a range of possible choices.
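To make the backpack example concrete, here is a minimal sketch that simply enumerates every subset of items and keeps the most valuable one that fits; the item names, volumes, and values are made-up illustration data:

```python
from itertools import combinations

# Hypothetical items: (name, volume in liters, value). Illustration data only.
items = [
    ("tent", 20, 60),
    ("stove", 10, 40),
    ("camera", 5, 30),
    ("books", 15, 25),
    ("snacks", 5, 15),
]
CAPACITY = 40  # the backpack's capacity limit in liters

def best_packing(items, capacity):
    """Brute force: check every subset, keep the most valuable feasible one."""
    best_value, best_subset = 0, ()
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            volume = sum(vol for _, vol, _ in subset)
            value = sum(val for _, _, val in subset)
            if volume <= capacity and value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset

value, packed = best_packing(items, CAPACITY)
print(value, [name for name, _, _ in packed])  # 145 ['tent', 'stove', 'camera', 'snacks']
```

Brute force works here because five items yield only 32 subsets; real-world instances require the specialized algorithms discussed below.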
More generally, optimization problems often require precise, optimal, or at least feasible solutions in a large, discrete search space. LLMs do not guarantee correct solutions for such tasks; their answers may offer approximate guidance but come with no assurance of feasibility. Put simply: LLMs don't calculate, they make educated guesses.
Optimization algorithms determine whether a solution is feasible by checking that it satisfies the problem's constraints. If a candidate solution meets all the necessary conditions, it is feasible; if not, the algorithm either adjusts the solution or tries a different one until it finds a valid, and ultimately optimal, outcome. LLMs cannot perform such feasibility checks because they are not designed for exact numerical computation. Feasibility in optimization requires rigorously evaluating whether a solution satisfies specific constraints, which LLMs are not trained to do.
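A feasibility check itself is nothing mysterious: it is a deterministic test against the constraints. For the backpack scenario, a sketch (with made-up item volumes and values) could look like this:

```python
# Hypothetical knapsack feasibility check: items are (volume, value) pairs.
def is_feasible(chosen_items, capacity):
    """A solution is feasible iff its total volume respects the capacity constraint."""
    return sum(volume for volume, _ in chosen_items) <= capacity

print(is_feasible([(20, 60), (15, 25)], 40))  # 35 liters fit into 40 -> True
print(is_feasible([(20, 60), (25, 70)], 40))  # 45 liters exceed 40 -> False
```

This exactness is the point: the check either passes or fails, with no room for a plausible-sounding but wrong answer.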
Solving a Sudoku puzzle is a good example. At its core, this is a very simple optimization problem: a naive formulation of a 9x9 Sudoku uses 9x9x9 = 729 variables. For perspective, optimization problems in practice easily surpass tens or even hundreds of thousands of variables.
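The 729-variable count comes from a standard binary encoding: one decision variable per combination of row, column, and digit. A quick sketch of that bookkeeping:

```python
from itertools import product

# One binary decision variable per (row, column, digit) combination:
# x[r, c, d] = 1 would mean digit d + 1 is placed in row r, column c.
variables = {(r, c, d): 0 for r, c, d in product(range(9), repeat=3)}
print(len(variables))  # 9 * 9 * 9 = 729
```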
An LLM “solves” a Sudoku by predicting the missing numbers based on patterns it has already learned. However, this guessing involves no explicit checks ensuring that each number appears only once per row, column, and 3x3 box. This can work well for simpler Sudokus, but it struggles with harder ones. In contrast, an optimization algorithm tackles Sudoku by systematically applying the rules as constraints, searching for the correct number in each cell through logical deduction and backtracking, which guarantees a valid solution for any solvable Sudoku puzzle.
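The backtracking approach just described can be sketched in a few lines of Python; the grid below is a well-known sample puzzle chosen only for illustration, not the instance discussed next:

```python
def valid(grid, r, c, d):
    """Feasibility check: digit d must not already appear in row r,
    column c, or the 3x3 box containing cell (r, c)."""
    if any(grid[r][j] == d for j in range(9)):
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[br + i][bc + j] != d for i in range(3) for j in range(3))

def solve(grid):
    """Backtracking search: fill the first empty cell with a feasible digit,
    recurse, and undo the choice if it leads to a dead end."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if valid(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # backtrack: undo the dead-end choice
                return False  # no digit fits this cell
    return True  # no empty cells left: the grid is solved

# 0 marks an empty cell; this is a widely used sample puzzle.
puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
print(solve(puzzle))  # True: every placement passed an explicit feasibility check
```

Unlike an LLM's guess, every digit this search places has passed the `valid` check, so the final grid provably satisfies all Sudoku rules.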
To illustrate this, take a simple Sudoku example: ChatGPT outputs the correct result only if we specify that the solution starts with ‘4 3 5‘; otherwise, it fails to produce a correct result.
One can easily see that the Sudoku rules were violated when no additional hint was given, and that ChatGPT even changed the input data. You can easily try this yourself: simply copy and paste the Sudoku image into a ChatGPT prompt.
This is why combinatorial optimization problems are approached and solved with specialized algorithms (such as branch-and-bound or dynamic programming techniques) that are designed to guarantee feasible solutions and can provide clear, traceable steps for how solutions are obtained. Don’t believe us? Simply ask ChatGPT about its limitations and it will respond: ‘As an LLM, I’m not built to directly "solve" large, complex optimization problems like specialized optimization solvers are. I can, however, assist with formulating, simplifying, and explaining approaches to such problems.’
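To give a taste of what "dynamic programming" means here, a sketch of the classic knapsack recurrence, again with illustrative volumes and values:

```python
def knapsack_dp(volumes, values, capacity):
    """Classic 0/1-knapsack dynamic program: best[v] holds the maximum value
    achievable with total volume at most v."""
    best = [0] * (capacity + 1)
    for vol, val in zip(volumes, values):
        # Iterate downward so each item is used at most once.
        for v in range(capacity, vol - 1, -1):
            best[v] = max(best[v], best[v - vol] + val)
    return best[capacity]

# Same illustrative backpack data as before: the exact optimum is returned.
print(knapsack_dp([20, 10, 5, 15, 5], [60, 40, 30, 25, 15], 40))  # 145
```

Unlike brute-force enumeration, this runs in time proportional to the number of items times the capacity, and it still comes with a guarantee of optimality.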
At Quantagonia, we believe the key lies in combining the best of both worlds: we provide a user-friendly LLM that translates between the plain language in which the user describes the problem and the precise mathematical language required by our optimization solver.