Eugenio Cano-Manuel Mendoza

Using SQLite to optimise my electricity usage

2021-04-10

I'm in the process of switching to Agile Octopus, an electricity tariff that ties the price I pay with the wholesale price, updated every 30 minutes. The reason I've been looking forward to switching to a tariff as dynamic as this one is to restrict my electric boiler to only run during the cheapest times of the day.

My boiler is a direct unvented cylinder. In Layman's terms this is a big cyclinder that heats and stores water. In a perfect world all the water stored in the heater would not lose any heat over time when hot water is not being used, but this is a pretty old boiler so it's not really efficient. Leaving it "on" the whole day can actually use a lot of energy as it tries to compensate for loss of heat at random times.

I got a smart meter installed at the beginning of March (a prerequisite for getting Agile). This also lets me track my usage on a half-hourly basis. I decided to run a few experiments with the boiler: switching it off for some 24-hour periods, switching it on only during a 2-hour window, moving the window around etc. This was mostly to get a sense of how much power it needs on a daily basis.

My guesstimate is that the boiler needs to run for about 2 hours at 3kW. My daily consumption is about 13kWh, so the boiler accounts for almost half of that (2h * 3kW = 6kWh).

I found a website where I can download the historical pricing data for Agile in London. It's basically a CSV file with the 30 minute window of the day and the price (pence per kW) for that window. The data looks like this:

...
2021-04-10T18:00:00Z,19:00,C,London,16.59
2021-04-10T18:30:00Z,19:30,C,London,15.057
2021-04-10T19:00:00Z,20:00,C,London,16.296
2021-04-10T19:30:00Z,20:30,C,London,14.7
2021-04-10T20:00:00Z,21:00,C,London,14.07
2021-04-10T20:30:00Z,21:30,C,London,12.369
2021-04-10T21:00:00Z,22:00,C,London,12.915
2021-04-10T21:30:00Z,22:30,C,London,12.6
...

The question I need to answer is simple: What's, on average, the cheapest 2-hour window?

It would be fairly straightforward to write a small program to calculte this, but I wondered if there was a simple way to do it without writing any code. Surely a spreadsheet could calculate it (somehow) but I was thinking I could probably do this with command-line tools only.

I decided to try solving this using SQLite. This was my first time using it so it took me a while to read through the documentation. Granted I could have finished earlier had I written a small Python script but I figured I only had to pay for this overhead once, and who knows, maybe knowing how to process CSV files in SQLite may turn useful in the future.

The sqlite3 command-line program is easy to use and can automatically ingest CSV files into a table. The schema is determined from the header of the CSV file, which I had to add manually but no big deal.

sqlite> .mode csv
sqlite> .import csv_agile_C_London.csv agile
sqlite> .schema agile
CREATE TABLE IF NOT EXISTS "agile"(
  "time_utc" TEXT,
  "time_london" TEXT,
  "area_code" TEXT,
  "area_name" TEXT,
  "unit_price" TEXT
);

Magic.

Now the query. Let's get the average price per 30-minute window for 6 months and store it in a view.

sqlite> CREATE VIEW avg_by_period AS SELECT TIME(time_utc) slot, AVG(unit_price) price FROM agile WHERE time_utc BETWEEN "2020-10-01" AND "2021-03-01" GROUP BY 1 ORDER BY 1 ASC;
sqlite> SELECT * FROM avg_by_period;
00:00:00,8.30654304635761
00:30:00,9.10941059602649
01:00:00,8.94801655629139
01:30:00,8.07373509933775
02:00:00,8.2191357615894
02:30:00,7.46091059602649
03:00:00,7.83877152317881
03:30:00,6.97457284768212
04:00:00,7.61611589403974
04:30:00,7.30459271523179
05:00:00,8.10739072847682
05:30:00,8.60464569536424
06:00:00,9.55020198675496
06:30:00,9.99593046357616
07:00:00,9.84503642384106
07:30:00,11.3778278145695
08:00:00,10.9571324503311
08:30:00,12.1820860927152
09:00:00,12.0985033112583
09:30:00,12.2710927152318
10:00:00,12.088142384106
10:30:00,11.9168046357616
11:00:00,11.7900397350993
11:30:00,12.0390496688742
12:00:00,12.3426456953642
12:30:00,12.3272781456954
13:00:00,12.3196291390728
13:30:00,11.4825496688742
14:00:00,11.3758112582781
14:30:00,11.0189503311258
15:00:00,12.5045960264901
15:30:00,13.8876754966887
16:00:00,24.7437715231788
16:30:00,28.9192947019868
17:00:00,29.8375927152318
17:30:00,30.551940397351
18:00:00,27.3357417218543
18:30:00,24.6017086092715
19:00:00,14.0497649006622
19:30:00,11.5805264900662
20:00:00,11.6012483443709
20:30:00,10.3603708609272
21:00:00,11.0939105960265
21:30:00,9.63385430463576
22:00:00,10.1794370860927
22:30:00,8.59692715231788
23:00:00,9.52370860927153
23:30:00,8.60109933774834

It looks like the cheapest time, on average, is the 3:30 slot. But we're looking for the cheapest 2-hour window. We can do this with a subquery:

sqlite> SELECT slot, (SELECT SUM(price) FROM avg_by_period WHERE slot IN (a.slot, TIME(a.slot,'+0.5 hours'), TIME(a.slot,'+1 hours'), TIME(a.slot,'+1.5 hours'))) from avg_by_period a;
00:00:00,34.4377052980132
00:30:00,34.350298013245
01:00:00,32.701798013245
01:30:00,31.5925529801324
02:00:00,30.4933907284768
02:30:00,29.8903708609271
03:00:00,29.7340529801324
03:30:00,30.0026721854305
04:00:00,31.6327450331126
04:30:00,33.5668311258278
05:00:00,36.2581688741722
05:30:00,37.9958145695364
06:00:00,40.7689966887417
06:30:00,42.1759271523179
07:00:00,44.3620827814569
07:30:00,46.6155496688742
08:00:00,47.5088145695364
08:30:00,48.6398245033113
09:00:00,48.3745430463576
09:30:00,48.0660794701987
10:00:00,47.8340364238411
10:30:00,48.0885397350993
11:00:00,48.4990132450331
11:30:00,49.0286026490066
12:00:00,48.4721026490066
12:30:00,47.5052682119205
13:00:00,46.196940397351
13:30:00,46.3819072847682
14:00:00,48.7870331125828
14:30:00,62.1549933774835
15:00:00,80.0553377483444
15:30:00,97.3883344370861
16:00:00,114.052599337748
16:30:00,116.644569536424
17:00:00,112.326983443709
17:30:00,96.5391556291391
18:00:00,77.5677417218543
18:30:00,61.8332483443709
19:00:00,47.5919105960265
19:30:00,44.6360562913907
20:00:00,42.6893841059603
20:30:00,41.2675728476821
21:00:00,39.5041291390728
21:30:00,37.9339271523179
22:00:00,36.9011721854305
22:30:00,35.0282781456954
23:00:00,35.540761589404
23:30:00,34.9650695364238

I'm using IN and not BETWEEN so that the query works for the periods between 22:30 and 23:30, as the offset makes the time wrap around the day.

It looks like the cheapest 2-hour window is the one that starts at 3:00.

I will probably set the boiler to run at 4:00 instead, as the difference is not great and it brings it closer to "normal" hours, when I'll be using hot water.

SQLite was fun to use, I'm impressed by how quickly you can get queryable data from CSV files.