ChatGPT: Analyse 1Y of EV6 CAN Bus Data [Code Interpreter]
Need to analyse tons of CAN bus data, fast?
In this intro you'll learn how to use ChatGPT (GPT4) with Code Interpreter to analyse 'big data' from the CAN bus of your vehicles/machinery.
By the end we expect you'll agree:
GPT4 with Code Interpreter is a game changer.
Why use ChatGPT for your CAN data analysis?
Our end users face a challenge:
They manage to collect tons of CAN/LIN data using the CANedge - but now have to extract insights across several devices, thousands of log files and months/years of data.
Most of our users are engineers, but not data scientists. And even if they are, statistical data analysis is very time consuming.
But what if AI could help?
Specifically, we investigate the practical usefulness of ChatGPT (GPT4) with Code Interpreter. In short, this lets you upload data and ask it to analyse the data via Python.
Our CAN bus dataset
For this evaluation, we use CAN/UDS data from a Kia EV6 electric car. The data was recorded with a CANedge incl. GPS/IMU and includes information on the EV battery, temperatures, position, speed, acceleration and more (see the summary).
Warning: Before you use GPT4 on your own CAN bus data you should review your company's policy on this.
How to enable Code Interpreter with GPT4
Below we outline how to set up GPT4 with Code Interpreter. Once done, you should see a small "+" button in your chat window:
1: Upgrade to ChatGPT Plus (20$/month)
2: Enable Code interpreter in settings
3: Select Code Interpreter in GPT4 chat
How to prepare your CANedge CAN bus data for GPT4
The devices log data on an 8-32 GB SD card and support optional WiFi, 3G/4G and GPS/IMU.
With the CANedge software/APIs, you can follow one of the below methods:
1: asammdf GUI
- Load your MF4 log file(s) in the asammdf GUI
- Optionally concatenate the data
- DBC decode the data
- Export the decoded data to CSV
2: Python API
- Download and modify our Python script
- Follow the README installation guide
- Add your own MF4 files and DBC file(s)
- Run the script to produce your CSV
For details on our data preparation and key considerations, see below.
Using asammdf is the simplest way to get started with your own data. In our case, however, we must use the Python API because our data contains multiframe UDS response data from the Kia EV6 (not currently supported by the asammdf GUI).
In theory it should be possible to allow GPT4 to perform the DBC decoding and data processing by providing it with the relevant Python packages in the form of *.whl files and some guidance. We did not test this particular step, however, as the purpose in this article is to focus on how competent GPT4 is when analysing a 'prepared' dataset.
Similarly, GPT4 can handle multiple CSV files rather than a single CSV file (as shown later). However, GPT4 achieves this by looping through the files and we expect it would increase the delay significantly between prompt responses, hindering the workflow. In addition, by performing the CSV concatenation prior to upload, we create a more consistent starting point for the analysis, while making it easier to double-check the results on our side.
When exporting DBC decoded CAN bus data to CSV, we typically distinguish between two types of data structures:
1: Signals in rows
Here, a 'Signal' column contains the name of each signal (e.g. Speed, RPM, ...). This lets you efficiently store signals with different timestamps and frequencies. This can be useful for CAN bus data where each CAN message generally has a unique timestamp/frequency. However, it also makes it cumbersome to e.g. calculate ratios between signals on-the-fly.
You can produce this CSV structure via our Python API.
2: Signals in columns
Here, each column represents a unique signal, all sharing the same timestamp (found in the index column). To represent signals across multiple CAN messages in this structure, one has to use resampling and/or extrapolation. For example, one could resample each signal to a shared 5-second frequency.
You can produce this structure via our Python API and the asammdf GUI.
Generally, the first method is superior from a technical perspective as you can minimize resampling/extrapolation, thus retaining the best data quality. For example, resampling a 100 Hz acceleration signal to a 5 second frequency results in this signal losing almost all meaning. Conversely, resampling an outdoor temperature to a 5 second frequency implies zero practical loss in quality. With the first data structure, you can address these considerations dynamically, tailoring your approach to each analysis. Further, for some use cases it is critical that data is analysed at the original frequency (e.g. in some automotive diagnostics use cases) and here this structure is recommended.
With that said, if you are going to analyse CAN data across e.g. 1 year, then you'll most likely need to do some level of resampling before exporting it to CSV (to avoid analysing e.g. 5-10 GB of data per device). For such a use case we generally recommend the second structure.
For our use case, we want to end up with a single CSV file (signals in columns) that has a consistent set of headers. This could be achieved by looping through all the log files and loading them into a single pandas dataframe, then saving this. Given that our data might exceed available RAM, this approach may not be scalable.
Instead, we append data to a CSV file that accumulates with each loop iteration. To do this, a list of signals needs to be defined up front which must be available in each dataframe. Without this, we cannot ensure that each value appended ends in the correct column of the CSV.
Basically, the signal list will be used to filter each dataframe (to ensure we do not get excess signals), as well as discard dataframes (if they do not include all required signals, e.g. because of a very short file, differences in configuration etc.).
In order to identify this list of signals, we take the following steps:
- Identify a log file that should be 'representative' of the majority of the data
- Extract and restructure the dataframe to have signals in columns
- List the column names using list(df_phys) and subsequently sort the list
- Use this list as the preliminary signals_required variable and filter all decoded logs
- Next, loop through all the data and extract the list of column names for each file
- If the column names of a file differs from the signals_required, print a warning and review the reason for the difference, then take action accordingly
Upon iterating, you might find that while 30% of the data aligns 95% with your 'signals_required', there are discrepancies, possibly due to 1-2 redundant signals. By storing the signals_extracted in a new list, you can quickly look for the unique elements between that and your signals_required via below line:
You then update your signals_required list to remove the redundant signals and repeat until you've arrived at a satisfactory signals_required list that also lets you work with e.g. 80-90% of your data. Of course, some use cases may require a different approach.
Before uploading the large CSV file to GPT4 it can be useful to do a quick review of the data.
For example, do some sanity checks by loading the file in Excel (if possible) - or create a smaller output file based on e.g. just 20 logs, then upload it to GPT4 for initial feedback.
For example, ChatGPT informed us that the signal VehicleSpeed had several missing values. This is a known problem with UDS responses from the Kia EV6 (and other Hyundai/Kia EVs) which is why we'd normally discard this parameter in favour of the Speed signal produced by the CANedge GPS module (which is consistently reliable). We therefore updated the output file to remove VehicleSpeed.
Having prepared our CAN bus data, we now have a 'decoded_ev6_data.csv' file of ~110 MB - ready for ChatGPT!
Get the CSV in our 'EV data pack'
Want to try out ChatGPT with the EV6 data yourself?
Our 100% free EV data pack includes the full ~110 MB decoded EV6 CSV file used in this article!
The pack also includes the 1 GB of raw CAN data, configuration file and DBC file for the EV6 - and other popular electric cars like the Tesla Model 3, Nissan Leaf, Enyaq and more!Get the data pack
12 awesome show cases of GPT4 + CAN bus data
Now comes the fun part: Analysing the EV6 CAN bus data with GPT4!
We start by uploading the CSV to get a summary:
As evident, GPT4 uses Code Interpreter to perform a high level summary of the data. This is not rocket science, but it's great for getting a quick overview and spotting potential issues. Note that all Python code can be viewed via the 'Show work' tabs.
In the following sections, we go through 12 show cases of how you can use ChatGPT with Code Interpreter.
In the following analyses, we adhere to the following restrictions:
- We do not use any coding knowledge in any of our prompts
- Each analysis was completed in <30 min (most in <10 min)
- All outputs you'll see are directly from ChatGPT (with no edits by us)
- We started a new 'chat session' for most of the examples to get a clean slate
- The analyses were done in August 2023 (ChatGPT August 3 version)
Example #1: Creating 10 insightful data visualisations
First we will try two approaches to visualising our data.
1.1: High level request
To get a feel for the data, we start by creating some quick data visualisations with the shown prompt.
ChatGPT proposes visualisations, creates the code and produces the requested 2 x 5 grid of visuals. Pretty cool!
However, the plot also illustrates a key problem:
Our prompt is too high level.
While GPT4 'delivers' on our request, the selected signals and charts are not that insightful (detailed workflow below).
Below is the full ChatGPT workflow (excl. the Python code generated), followed by the plot insights it provides.
As per the workflow, ChatGPT defines a list of tasks for each part of our prompt, then produces the code required to deliver the necessary output. The code it generates sometimes fails and ChatGPT then uses the error information to refine the code and try again. At one point we need to give it a 'nudge' to continue, but other than that it succeeds in finishing the task in the first attempt.
ChatGPT also provides a summary conclusion of each chart. The first iteration is generic and not useful. However, when we give ChatGPT this feedback it delivers far more relevant insights:
1.2: Specific request
In short, this 1st result was mediocre - but so was our prompt.
Let's try again with a more specific prompt as shown.
As seen in the resulting plot below, GPT4 does exactly what we ask - and the end result is far more useful.
Note that we only provide minor styling follow-ups - what you see is practically the 'first response' from ChatGPT.
Further, when prompting ChatGPT to provide 3 key takeaways it does a good job and e.g. highlights the relevant correlation between OutdoorTemperature and BatteryMaxTemperature (plots 2 and 3). It does, however, also hallucinate a bit as shown below.
Below is the full ChatGPT workflow followed by the plot insights it provides. As follow-up to our original prompt, we make a number of styling corrections and tweaks (e.g. converting Speed to SpeedKMH), but this is quickly handled by ChatGPT.
As shown, most of the insights make sense and are relevant.
Interestingly, however, GPT4 claims in the second insight that the EV6 StateOfHealth is deteriorating. That's a bold claim and seems inconsistent with the visual plot. After some probing, GPT4 admits that this was an error.
As we will see throughout other examples, this is an example of ChatGPT producing valid plots/numbers - but often invalid conclusions. Specifically, our hypothesis in this case is that ChatGPT 'solves' the insight aspect of our prompt based on its training data, rather than on the actual numbers - and it must be 'forced' to use the actual results in order to avoid this. In turn, this also begs the question as to whether the conclusions ChatGPT provided on the other plots are actually based on the data - or if they are just 'likely conclusions' given the context.
Example #2: Calculating kWh/100km
Next, let's test creating a new parameter.
When it comes to analysing the driving behaviour in electric cars, it is often of interest to look at kWh/100km as a measure of battery consumption. This is not directly available from the EV6 data, but has to be calculated.
As per our request, GPT4 calculates the battery consumption parameter correctly and visualises it. For fine-tuning, we ask GPT4 to remove outliers and exclude 1-hour periods that had less than 5 minutes of data in them.
This example may appear simple, but for CAN bus data analysis it is vital that ChatGPT is able to calculate custom signals on-the-fly. We'll use this ability frequently in subsequent analyses.
Example #3: Identifying correlating drivers of battery consumption
Now that we have BatteryConsumption let's analyse correlations between this and key parameters.
3.1: Correlation vs. select signals
As per the prompt shown, we start by explicitly informing ChatGPT which parameters we wish to evaluate.
Note that ChatGPT re-uses BatteryConsumption from our previous example, but now also has to calculate AvgTirePressure from the individual tire pressure signals of each wheel.
In response, ChatGPT correctly calculates the correlation coefficients and creates neat scatter plots as per our request (plots and coefficients are correct).
Correct plots, incorrect conclusions
Curiously, however, ChatGPT also comments on the results as requested (in the same response as the plot) - yet refers to incorrect correlation coefficients (complete with compatible - and thus incorrect - conclusions).
We point this out, after which ChatGPT again refers to incorrect correlation coefficients (but new ones vs. before).
This again illustrates how ChatGPT can be prone to hallucination when asked to derive insights from its own analyses.
We eventually resolve this by asking it to redo the analysis and 'think step-by-step' (more on this later), after which it produces both valid plots and conclusions.
Result 1: GPT4 visualizes the data correctly - but struggles with conclusions
Below is the detailed conversation. Note how ChatGPT again struggles with basing its conclusions on the actual data, instead generating arbitrary correlation coefficients. Importantly, this should not detract from the quality and usefulness of the actual plot produced (which is valid and easily replicated based on the code). However, it stresses again the need for caution when asking ChatGPT to interpret its own results.
3.2: Top 3 correlating signals
The correlations we picked above might not be the most relevant. Let's try asking ChatGPT to identify the top 3 correlations across the dataset (with some conditions).
In response, ChatGPT correctly identifies the top 3 correlations.
Initially, it lists TireTemperatureFrontLeft, OutdoorTemperature and TireTemperatureBackRight. This is technically correct, but not what we intended. Instead, we ask it to calculate AvgTireTemperature (and ignore the individual tires).
Additionally, ChatGPT incorrectly displays the absolute form of each correlation coefficient (both in plots and conclusions). This happens because absolutes are used for the ranking - and GPT4 forgets to revert to non-absolutes for the remaining analysis.
With these tweaks, ChatGPT ends up with AvgTireTemperature, OutdoorTemperature and IndoorTemperature as shown in the plot - and it provides a sensible conclusion as well (see below).
Result 2: After some corrections, we get the requested plot
Example #4: Visualising parameter relations in 3D surface plots
Next, let's try something simple and cool: 3D plots!open the 3D plot
GPT4 uses plotly to create the 3D surface map - neat!
Example #5: Creating geographical plots of the car's GPS data
The dataset also contains GPS positions recorded by the CANedge GPS/IMU module. To better understand the vehicle dynamics it can be useful to plot these positions on a map.
5.1: GPS positions as route
Let's start with a very basic route plot.
As evident, this initial prompt was simple and ChatGPT manages to deliver (using folium) without any issues.open the GPS plot
5.2: Speed-based geo heatmap
Let's try something harder:
Creating a geo-heatmap with speed-based coloring.
Here, ChatGPT breaks down our complex prompt into actionable subtasks and proceeds to execute each task.
Through this process, it corrects itself multiple times. At one point it also 'times out', but resumes when we ask it to 'continue'.
It successfully produces the requested plot, after which we follow-up with a few styling tweaks (e.g. asking it to remove circles with a radius beyond 50 km).
Note: GPT4 solved this task in <5 min!open the GPS plot
Result 1: An interactive map with visualised routes
Prompt 2: Create a geographical heatmap of positions and speed
Below is the full conversation. ChatGPT practically creates this in 'one go', except for the zoom level of the map.
Example #6: Identifying top 10 most frequently visited areas
Let us try another complex geospatial analysis.
A practical challenge encountered by many of our users is to evaluate the most frequently visited areas for a specific vehicle.
This is the most complex problem so far and importantly:
We initially don't know how to solve this problem.
6.1: Let ChatGPT figure it out
Since we do not have a clear way forward, we instead simply outline the intended end result - with zero guidance.
ChatGPT lays out a plan, but fails in defining areas.
Specifically, the chat bot attempts rather wild solutions such as DBSCAN clustering and KD-Trees. Both fail.
But ChatGPT then does something awesome:
It recognises that it's not getting anywhere, outlines two options - and asks for guidance!
This characterizes a great analyst - knowing when to ask for help and proposing viable routes forward.
Result 1: ChatGPT initially fails, but proposes a grid-based solution
Below is the conversation showing how ChatGPT tries to solve the task - but fails with the DBSCAN and KD-Tree approaches:
6.2: Use GPT4's idea with guidance
We ask GPT4 to proceed with the grid based approach and it then produces a valid result. However, the road to this is rather iterative so to get a cleaner starting point we redefine our original prompt in a new session as shown in the 2nd prompt.
ChatGPT now manages to create the intended result almost out-the-gate with mainly a few styling corrections.
We do make one 'significant' correction on #hours:
GPT4 correctly identifies that most observations have 5 seconds between them (as we've resampled the data to 5 seconds), but it extrapolates this assumption to hold true across all observations - which is incorrect. For example, the delta between the last observation of trip X and the first observation of trip Y can be minutes, hours or even days.
In its defense, GPT4 identifies and fixes this mistake when we point out: 'The #hours values seem lower than expected?'. After fixing this, it produces the exact intended result.open the GPS plot
Below is the detailed conversation. This is a relatively rare example of ChatGPT making an erroneous assumption in the Python code, specifically assuming that the time between observations is always 5 seconds (an extrapolation it makes based on part of the data).
One thing worth noting is that GPT4 struggles with adding e.g. city names, addresses or zip codes to each area. This is because it does not have access to geocoding packages and it is reluctant to use potentially outdated knowledge before the 2021 training data cutoff.
If pushed to provide a "best guess" as to the country of Area 1, it proposes Netherlands (the correct answer is Denmark, so it's not even close). However, this turns out to be an issue with GPT4 using only a single decimal in the latitude and longitude values.
After requesting that it leverages the full precision GPS coordinates, GPT4 does manage to map 8 of 10 areas to Denmark (Aarhus), which is correct - but fails to map two areas, specifically Denmark (Ebeltoft) and Austria (Kirchberg). For this reason we showcase the map without the partial geocoding result.
We expect Code Interpreter will at some point be able to fetch Python packages online, which will enable it to solve this type of task with ease.
Example #7: Visualising top 10 acceleration events
When analysing large amounts of data, you'll often be looking for specific events. To illustrate this, we will try to identify the top 10 acceleration events of the car as per the prompt shown.
Obviously, this is another complex request. We are deliberately asking ChatGPT to do a ton of stuff in one prompt as a challenge. Despite this it does pretty good on the first go.
As detailed in the full chat conversation below, the 1st prompt results mainly in events where the car starts after a standstill. In practice, we may be more interested in acceleration events during active driving, which motivates our follow-up prompt.
This follow-up prompt is not easy, but ChatGPT succeeds!
Fun fact: We don't prompt ChatGPT to add the dotted red 'event line' in Result 1 - it does so unprompted.open the GPS plot
Prompt 2: Add complex conditioning to the event identification
Result (3 of 3): Top 10 harsh acceleration events listed in a table structure
This conversation is fairly iterative with multiple adjustment prompts added in order to arrive at the final result.
In particular, we provide a few styling prompts and ask GPT4 to filter out events where ImuValid or SpeedValid equal 0. GPT4 struggles with labelling each event with a consistent 'event index' across all three outputs, but succeeds with some guidance. The results of one of the outputs is below (showing the event plot).
As evident from the results of the first prompt, many of the plots include missing observations because SpeedKMH is zero during part of the dataset. This is expected as the CANedge is configured to only record data when Speed exceeds zero. It also means that a lot of the acceleration events reflect the car starting up after a stand still.
To fix this, we make the follow-up prompt shown in 13 / 20 below. This follow-up prompt is non-trivial because of the iterative nature: ChatGPT has to both identify the top 10 acceleration events - while also ensuring that each event has 121 observations. It tries a few iterative approaches until it eventually decides on a rolling window methodology that achieves the task with sufficient efficiency.
As shown in the line charts, this method ensures in 9 of 10 plots that there is also 60 seconds of data before/after the actual event, even though that is not in principle a requirement in the initial filtration that GPT4 produces. As evident, one plot (Event 5) thus also lacks some data after the event, but seemingly makes the cut because it has sufficient data before the event to fulfil the requirement of 121 rolling observations.
After some minor styling prompts, we arrive at the final output - showing the top 10 harsh braking events along with the surrounding speed data.
Example #8: Daily distribution of consumed State of Charge (SoC%)
When working with CAN bus time series data you'll often need to work with the change in a variable, rather than the absolute value. However, doing so can be non-trivial for large amounts of data. Let's put ChatGPT to the test.
Our prompt for this task is very long, but with clear sub tasks.
In response, ChatGPT does a good job as per the below results - providing us with an overview of when the car is primarily used throughout the day and whether some hours of the day appear to enable more efficient driving than others.
Disclosure: As explained below, we made multiple attempts before arriving at this initial prompt.
For most of the examples in this article, we define our initial prompts in advance - and do not modify them through iteration. We do so to mimic real-life analysis, where you may not have time to perfectly engineer an initial prompt.
However, for this particular prompt we did go through about 3-4 iterations. Our original prompt was less specific on how to construct the calculated signals, e.g. not explicitly stating that the delta in DistanceTrip should exclude negative values. Without this guidance, ChatGPT would calculate a negative delta between the last observation of Trip A and the first observation of trip B, which is not our intent. Through iteration with ChatGPT we would identify unexpected values for the calculated signals and adjust our prompt accordingly.
Of course, this can be handled iteratively in the prompt conversation flow, but we decided to update the initial prompt instead to showcase how ChatGPT does succeed out-the-box if given sufficiently clear guidance on how to calculate each signal. This is also a reminder that even though ChatGPT is great, the user must do part of the thinking and scrutinise the results.
Example #9: Plotting State of Charge (SoC) over time excluding non-trips
A challenge when visualising data from vehicles over days or months is the fact that the data often consists of brief trips surrounded by a lot of nothing (when the vehicle is not driving).
To avoid wasting screen space on non-trip periods, let's try to cut them out as per our prompt.
ChatGPT produces the requested plot without any follow-ups required, providing a useful "zoomed in" perspective on the data that filters out all the non-trip gaps.
Example #10: Reviewing the quality of the GPS/IMU data
In order to showcase the ability of ChatGPT to derive high level insights from the data, we'll use it to help identify the validity of the GPS/IMU data we've recorded throughout.
Note how ChatGPT is tasked with deciding on the analysis in this case - although we do specify that it should focus on the validity signals. GPT4 arrives at the expected conclusion: That the signals are valid throughout almost the entire dataset.
Below is the summary conclusion made by ChatGPT:
If we do not specify which signals ChatGPT should focus on, it will attempt to identify this itself. It succeeds to some extent in our tests, but it will to a larger extent look at the actual signals (such as Speed, Latitude etc) rather than the more specific validity signals. One can argue whether this is preferable, but the result is an analysis that is more 'all over the place' - even if it arrives at the same conclusion overall.
As in other tests, providing a bit of extra guidance up-front results in vastly better output.
Example #11: Who's the better driver - Martin or Josefine?
Towards the end, we will try using ChatGPT to perform an analysis of driver performance for the car.
Specifically, the EV6 is used by two drivers: Martin and Josefine. We want to determine who drives most efficiently to/from work.
11.1: Clustering trips by driver
First, we need to cluster trips by driver. We do so through geo-tagging trips using the respective work places as per the prompt.
As evident from the below results, ChatGPT is able to segment the dataset into trips based on a simple logic we provide. It is also easily able to group the trips by driver based on whether the car moves into predefined geographical areas. We do some light filtration to focus only on work commutes and ask ChatGPT to summarize the trip information and plot it on a map.open the GPS plot
ChatGPT manages to perform the fairly complex trip clustering with ease and also provides summary statistics of the trips.
11.2: Visualising power consumption
Next, we use GPT4 to create a histogram of consumed power to compare regeneration patterns between the two drivers.
As evident, Josefine has a higher share of observations that fall into the light regenerative area compared to Martin.
Next, we'll derive a definitive conclusion.
Below is the workflow for generating the histogram. Note how ChatGPT wrongfully concludes that Martin is better at regenerative braking, which appears to contrast the results from the plot. More on this in the next section.
11.3: Accumulated consumption per km
To further evaluate the respective driver efficiency, we separately accumulate the negative and positive power consumption values and standardise them based on distance travelled (as shown in the prompt).
In response, ChatGPT does exactly what is requested, with only minor styling prompts required (results below).
As evident, the resulting plots clearly show that Josefine is a more efficient driver than Martin, specifically due to her ability to use regenerative braking better. This is also evident if we instead get ChatGPT to calculate battery consumption in kWh/100 km, where Josefine is almost 20% more efficient.
Valid plots, invalid conclusions
To cap it off, we ask ChatGPT a simple question: Who is the better driver and why?
And here it gets weird:
For all its amazing capabilities, ChatGPT wrongfully claims that Martin is the better driver.
This is a consistent theme across our examples:
While ChatGPT creates valid plot and analyses 9/10 times, it makes valid conclusions 5/10 times at best.
We comment on this in the chat workflow below.
Prompt 3B: Example of follow-up prompt
Prompt 3C: Calculate battery consumption per driver
Result 3 (2 of 2): Average battery consumption by driver
Below is the full workflow for the accumulated power consumption analysis.
As evident from the chat extract shown, ChatGPT appears to swap the names of the drivers when asked to provide a conclusion. Additionally, it only fixes this upon being informed of the error.
Interestingly, ChatGPT makes the same erroneous conclusion when reviewing the previous power consumption histograms, claiming that Martin was driving more efficiently. In fact, it even makes the same wrong conclusion when based on the extremely clear-cut battery consumption plot - generating new hallucinated consumption averages as part of its response.
As discussed previously, this appears to reflect GPT4 defaulting to its training data for conclusions, thus hallucinating 'likely' values for each driver instead of relying on the factual results it just produced.
If that is the case, then it is curious that ChatGPT consistently believes Martin is the better driver across sessions. Maybe this reflects an intrinsic bias in ChatGPT towards thinking men are better than women at driving. Or, maybe it thinks that the prompter is Martin (which is correct) and thus finds it likely that a conclusion that favorises the prompter is preferred. Regardless, it's surprising how frequently ChatGPT ignores the factual results and instead goes for hallucination.
When asked why it fails, ChatGPT appears to suggest that it's not leveraging the 'hard data' it produced in previous responses, but rather does a visual inspection of the plot.
It is unclear to us if ChatGPT actually tries to extract the numbers from the plot. We tested this ability in isolation, uploading the battery consumption plot in a new session and asking ChatGPT to extract the numbers and derive the insights. It explains that it will do so using Optical Character Recognition (OCR) and proceeds to extract five numbers (19.0, 17.5, 15.4, 15.0, 12.5 - i.e. including the two intended values). It fails in this test to make the intended conclusion as well. Based on this it appears GPT4 has some level of competence for plot based number extraction, but it is too sporadic to be reliable. As such, GPT4 may be telling the truth when it claims that it tries using the data from the plot - but it could also be that the numbers it uses are completely made up.
Regardless, when ChatGPT actually re-does the calculation to fetch the 'hard numbers' it correctly concludes that Josefine is the better driver.
Example #12: Analysing data from multiple vehicles
So far, we've used a single CSV from a single Kia EV6. In practice, our end users often need to extract insights across multiple vehicles, boats or machines - sometimes entire fleets.
12.1: Simulating fleet data
To mimic this, we first re-purpose our data to simulate a fleet of 5 cars (using GPT4, of course).
We originally intended to manually restructure the dataset by breaking it into 5 roughly equal periods and then re-baselining the timestamps to make it appear the data was recorded in parallel.
But then we of course came to our senses and asked ChatGPT to do it for us:
ChatGPT manages to do this without any issues, producing 5 separate chunks of data and re-baselining it to start January 1, 2023. It adds the car identifier in a new column in each dataset, creates 5 CSV files and provides a download link for the zip containing these 5 CSV files. Neat!
One minor critique is that it did not compress the data in the zip (but then again, we did not ask it to).
Generally, you could take many approaches to assigning a vehicle/machine ID to a dataset. It could be provided via the file name, folder name, an extra column etc. We find it useful to have the ID in the column as it offers some flexibility for also mixing multiple assets within the same dataset file. From our tests you just need to make it sufficiently explicit and ChatGPT will understand the logic - and if not, you can explicitly tell it.
The result is a new zip file containing 5 individual CSV files, each structured as before - but with the CSV name and an extra column specifying the vehicle identifier of each CSV.
12.2: Visualising fleet data
Next, we ask ChatGPT to perform a simple analysis based on the CSV files. As shown below, ChatGPT easily works across the 5 individual files to calculate the new signals and plotting all of them in a single visual - great! This also shows that you can work with multiple datasets in parallel if preferred.
We also ask for a plot of battery consumption (a complex calculated variable), which it also does without issues.
Prompt 2: Visualize battery consumption across fleet
Result 1: Distance travelled across the 5 separate cars
12.3: 3D scatter plot
For some light action, we ask for a 3D scatter plot.
ChatGPT again delivers (using plotly), creating a cool 3D scatter plot of battery consumption vs. outdoor temperature split by car. Note how it correctly applies the cross-plot color-coding.open the 3D plot
12.4: Car ranking and distance split
We also ask GPT4 to create a visual ranking of the cars based on driving efficiency, which it delivers with minor styling edits.
And as a final challenge, we ask it to revert from the hourly aggregated data back to the 5-second resolution (in-session) to calculate the distance travelled by each car and to visualise this in a pie chart. It does so correctly (with a few tweaks), despite having worked at weekly / hourly averages for the previous plots.
Overall, ChatGPT fluently handles the combined challenge of multiple CSV files, calculated variables and different sampling frequencies.
Prompt 5: Create a pie chart of distance travelled
Result 4: Ranking of cars based on battery consumption
10 tips & tricks for using GPT4 with your CAN data
Below we outline practical tips and tricks:
GPT4 will often 'drop out' of the session you're in, in which case it will lose access to the uploaded data. Rather than starting up a new chat session, the simplest fix is to reupload the relevant data and ask it to retry.
GPT4 often hallucinates which variables should be used in a calculation or visualisation. It'll typically realise the mistake and prompt you to inform it about the correct variable name, but to save time you can inform it upfront.
If you know programming, it's useful to open the code produced by GPT4 in each step. Sometimes there are some odd decisions made by GPT4 in developing the code. In addition, GPT4 often remakes the code from scratch upon e.g. new attempts - so it can be difficult to keep track at times. We find it useful to request GPT4 to output the full consolidated code when relevant. If you do this, make sure to also ask it to output the result of the code in the same prompt. You can then save this and run it locally in case you need to reproduce the exact results later.
While GPT4 is fast, you quickly get lost in the rabbit hole and hours can pass. To minimise this, be very clear on what you need to ask before you start prompting. In fact, we recommend writing down your full initial prompt for each question, ensuring you detail it to the extent you believe necessary to provide GPT4 with a chance to 'get it right' in the first go.
In our tests, ChatGPT will struggle with large complex prompts. You'll notice that we therefore frequently use the phrase "Let's think step-by-step", which has been shown to enable the large language model (LLM) to more easily handle complex tasks. Without this phrase, ChatGPT tends to proceed more iteratively and plunge into the task. With the phrase, it typically outlines each step clearly up-front as a "plan" - and then proceeds to execute it, resulting in far fewer mistakes.
If you're going to use ChatGPT for similar analyses, you may want to sanitise your data up front. Specifically, you may consider removing GPS data from your dataset before uploading, as this will in practical terms make your data anonymous and may also be required for GDPR or CCPA purposes. This can be done easily via the asammdf GUI filters when exporting to CSV - or by using our Python API to filter out specific signals.
You'll often need to re-upload your data e.g. if a session expires. To speed this up you can zip the CSV file to reduce size by 80%+. This also allows you to easily upload multiple CSV files and/or larger CSV files than the normal file limit.
In many of the examples in this article we deliberately challenge ChatGPT by providing extensive prompts with multiple sub tasks and multiple outputs. In practice, it is often simpler to take it step by step, e.g. asking ChatGPT to produce one output at a time and wait for feedback before proceeding.
If you know certain variables need to be calculated for use in many prompts, do this in advance. This helps you get more consistent results between prompts and you avoid the risk of ChatGPT occasionally calculating a new variable incorrectly. Similarly, consider what sampling frequency and data structure minimises the effort from ChatGPT in order to perform your required analyses.
In our experience, working with GPT4 plus Code Interpreter is like having a junior analyst that produces output at an inhuman rate. Yet, like most junior analysts you'll need to provide clear guidance up front and diligently review any analyses produced.
This does not mean that you need to know Python code - but you should ideally have an understanding of how to work with time series data to get the most out of this tool.
As we wrote the above tip, we wanted to exemplify this. Specifically, we believed ChatGPT would not proactively remove 'charging events' when calculating the 'consumed' state of charge over time. However, for the sake of good order we checked this - and we were wrong ...
Our verdict on GPT4 with Code Interpreter
In the spirit of this article, you'll find below both our verdict - and ChatGPT's verdict:
As should be obvious from the examples, our verdict is clear:
GPT4 with Code Interpreter is a game changer.
If an engineer with light/medium Python skills had to re-create all of the outputs in this article, we estimate it would take 20-30 hours of work. In comparison, the prompting process with ChatGPT took 2-3 hours - 90% less time.
More importantly: You may not know Python at all.
Here, ChatGPT lets you perform analyses that would otherwise be impossible. This makes it a key supplement to existing CANedge software for ad hoc analysis of medium/large datasets (across many files, months, years, devices, ...).
Also, we agree with the relevant pros and cons listed in the ChatGPT verdict [sigh, why are we even writing our own...].
GPT4 with Code Interpreter lets you produce amazing data analyses and visualisations with a ridiculous output/effort ratio. The Python based outputs are almost always valid - and can be easily reviewed. However, GPT4 is prone to hallucination when asked to comment on its own analyses - so make your own conclusions.
We hope you've found this article insightful!
If you want to play around with ChatGPT and Code Interpreter, download the 100% free EV6 CSV dataset now.
Tip: We may add a section with 'user examples' - if you have a prompt + output, share it on LinkedIn or via our contact form!
Ready to use ChatGPT with your own CAN data?
Get your CANedge today!