These days only lazy people don’t write about ChatGPT and large language models (LLMs). Vendors are trying to be the first to announce a ChatGPT integration even when they don’t have anything serious to show. I’ve also written about it: see “ChatGPT Producing Simple Decision Models” and “LLM and Decision Modeling”. This weekend I decided to help ChatGPT (which is now at GPT-4) address the Challenge “Permit Eligibility” published by DMCommunity.org. It has a simple rule: “An applicant is eligible for a resident permit if the applicant has lived at an address while married and in that time period, they have shared the same address at least 7 of the last 10 years.” But this rule contains several tricky assumptions, so it is no wonder that DM vendors are not in a hurry to submit a solution.
A previous solution generated by ChatGPT was really bad. Instead of criticizing the current ChatGPT capabilities, I thought I should try to assist ChatGPT to produce a more reasonable solution. Below I describe the results of my efforts split into 3 parts:
- Part 1. My Dialog with ChatGPT
- Part 2. Validation of the ChatGPT Solution
- Part 3. Manual Conversion of ChatGPT Solution to Working Java Code.
In Part 1 I instructed ChatGPT to produce not just pseudocode but a complete Java solution. It did, and the resulting Java code was “good-looking”! However, it refused to execute the generated code, saying: “As an AI language model, I don’t have a local environment to execute the code. However, assuming the implementation is correct, this code should output: Is the applicant eligible for a resident permit? true”.
Unfortunately, this was an invalid answer. Anybody can manually apply the rule to the Challenge test data and confirm that this applicant is NOT eligible for a resident permit.
Then in Part 2, I put this solution into the Eclipse IDE to run it and analyze what went wrong. Quite quickly I found that ChatGPT had produced syntactically good-looking code with serious semantic flaws:
- Dealing with whole years rather than with days inside different periods
- Assuming that Residence periods always lie inside Marriage periods, although they may only partially intersect
- Double calculation of estimated years
- Ignoring periods shorter than 1 year.
A partial correction of this code would not be productive, so I decided to essentially redesign it in Part 3, where I manually converted the ChatGPT code into a working Java program. See all details in my just-published solution.
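To make the four drawbacks listed above more concrete, here is a minimal sketch of one way to avoid them. It is not the code from my published solution, and every name in it (`PermitEligibilitySketch`, `Period`, `qualifyingDays`, `isEligible`) as well as the dates in `main` are hypothetical. It rests on several assumptions: periods are half-open (the end date is excluded), “the same address” is already represented as a list of shared-residence periods, and “7 of the last 10 years” is read as the number of days in the 7 years preceding the evaluation date. The sketch walks through the 10-year window day by day, so it works in days, handles partial intersections and short periods, and cannot count the same day twice.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

public class PermitEligibilitySketch {

    // Half-open period [start, end): the end date itself is NOT included (an assumption).
    public record Period(LocalDate start, LocalDate end) {
        boolean contains(LocalDate day) {
            return !day.isBefore(start) && day.isBefore(end);
        }
    }

    // True if the given day falls inside at least one of the periods.
    static boolean covered(List<Period> periods, LocalDate day) {
        return periods.stream().anyMatch(p -> p.contains(day));
    }

    // Counts the days in the 10 years before asOf on which the applicant was
    // both married and sharing the address. Each day is visited once, so
    // overlapping periods in the input cannot be counted twice.
    public static long qualifyingDays(List<Period> marriages,
                                      List<Period> sharedResidences,
                                      LocalDate asOf) {
        long days = 0;
        for (LocalDate d = asOf.minusYears(10); d.isBefore(asOf); d = d.plusDays(1)) {
            if (covered(marriages, d) && covered(sharedResidences, d)) {
                days++;
            }
        }
        return days;
    }

    // Assumed reading of "7 of the last 10 years": at least as many qualifying
    // days as there are days in the 7 years immediately preceding asOf.
    public static boolean isEligible(List<Period> marriages,
                                     List<Period> sharedResidences,
                                     LocalDate asOf) {
        long required = ChronoUnit.DAYS.between(asOf.minusYears(7), asOf);
        return qualifyingDays(marriages, sharedResidences, asOf) >= required;
    }

    public static void main(String[] args) {
        // Hypothetical data, NOT the Challenge test data.
        LocalDate asOf = LocalDate.of(2023, 4, 1);
        List<Period> marriages = List.of(
                new Period(LocalDate.of(2016, 1, 1), asOf));
        // The residence starts before the marriage and ends during it:
        // the two periods intersect only partially.
        List<Period> residences = List.of(
                new Period(LocalDate.of(2014, 1, 1), LocalDate.of(2018, 1, 1)));

        System.out.println("Qualifying days: "
                + qualifyingDays(marriages, residences, asOf));
        System.out.println("Is the applicant eligible for a resident permit? "
                + isEligible(marriages, residences, asOf));
    }
}
```

The day-by-day loop is deliberately naive: a 10-year window has only about 3,650 days, so clarity matters more here than speed. A different reading of the rule (for example, calendar years instead of days) would require a different threshold.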
A few comments. The “GPT” in ChatGPT stands for Generative Pre-trained Transformer. It can transform (translate) a problem representation presented in one language into another language. It works like a human translator who knows both the source and target languages but who is not necessarily aware of the subject matter s/he needs to translate. For instance, such a human translator may translate a scientific text in biology, physics, or decision modeling reasonably well without knowing anything about these areas. So, only a human specialist can judge the quality of the translation.
Considering these limitations, it is amazing how much even today’s GPT-4 managed to do by translating my initial plain English description with samples of test data into Java. We can only imagine how much better and more powerful GPT-5, GPT-6, etc. will be. For now, we probably should not even ask ChatGPT to build complete, relatively large decision models.
However, ChatGPT already does a great job when asked to solve a small task. For instance, look at this dialog:
ChatGPT produced ready-to-go Java code. No wonder that today the most successful use of LLMs is as helpers or code completion modules in software IDEs. It means that soon we will see similar LLMs incorporated into Decision Modeling IDEs. Stay tuned!