Introduction

It is counter-intuitive that making a storage field larger than it needs to be can sometimes save you gas at runtime and also reduce the contract’s deployment size in bytes, but it’s true! In this post we’re going to demonstrate why this can be the case and look at the tools you can use to prove it.

Storage Slot Packing Recap

It’s pretty standard stuff to pack fields sensibly into storage slots, but this is essential background to the sections that follow. Therefore, we’re going to do a quick recap of storage slot packing in this section, and look at some different tooling you can use to inspect opcodes and measure gas. You can skip this section if you’re already up to speed and go straight to Packing A Spare Field.

EVM Persistent Storage

EVM persistent storage is read and written in words, where each word is 32 bytes. Each word is assigned a single storage slot. Each slot can contain one or more fields. When a slot is first touched in a transaction it is considered “warm” for the rest of that transaction, which reduces gas fees for all later interactions with all the fields in that slot.

When we get down to the opcode level, we’re going to be looking for the opcodes SSTORE (to write data) and SLOAD (to read data). The SSTORE opcode in particular is very gas intensive, which is fair enough because we’re writing data to every node in the underlying network. When you first write non-zero data to an empty slot you’re looking at 22100 gas to pay for it: 20000 for the write itself plus 2100 for touching a cold slot.

Let’s take a very simple example contract, one that only has storage fields:

contract Storage1 {
    address owner_20;
    uint240 large_30;
    uint96 small_12;
    uint8 tiny_01;
    bool flag_01;
}

The field names include the length of the field in bytes as a suffix. This is just for our convenience: it lets us quickly see sizes in a common unit of measure. Notice that the boolean, with only two possible values, still takes up a whole byte, but that’s just the way it is.

We can see how storage is organised like this:

$ forge inspect Storage1 storageLayout
╭----------+---------+------+--------+-------+--------------------------╮
| Name     | Type    | Slot | Offset | Bytes | Contract                 |
+=======================================================================+
| owner_20 | address | 0    | 0      | 20    | src/Storage.sol:Storage1 |
|----------+---------+------+--------+-------+--------------------------|
| large_30 | uint240 | 1    | 0      | 30    | src/Storage.sol:Storage1 |
|----------+---------+------+--------+-------+--------------------------|
| small_12 | uint96  | 2    | 0      | 12    | src/Storage.sol:Storage1 |
|----------+---------+------+--------+-------+--------------------------|
| tiny_01  | uint8   | 2    | 12     | 1     | src/Storage.sol:Storage1 |
|----------+---------+------+--------+-------+--------------------------|
| flag_01  | bool    | 2    | 13     | 1     | src/Storage.sol:Storage1 |
╰----------+---------+------+--------+-------+--------------------------╯

The compiler respects the order in which we declared the variables in our source code, and tries to fit them into a linear series of 32-byte slots. The above shows our contract uses 3 storage slots. The first (slot[0]) contains the address, the second (slot[1]) contains the large integer field, and the remaining fields are all packed into the third slot (slot[2]).

We have some wasted space, though. For example, we can see slot[0] just contains the 20 bytes of an address, which means 12 bytes are left empty in that slot.

Reordering Fields

We can save a whole slot by re-ordering our variables. Let’s see what happens if we swap over large_30 and small_12:

contract Storage2 {
    address owner_20;
    uint96 small_12;
    uint240 large_30;
    uint8 tiny_01;
    bool flag_01;
}

$ forge inspect Storage2 storageLayout
╭----------+---------+------+--------+-------+--------------------------╮
| Name     | Type    | Slot | Offset | Bytes | Contract                 |
+=======================================================================+
| owner_20 | address | 0    | 0      | 20    | src/Storage.sol:Storage2 |
|----------+---------+------+--------+-------+--------------------------|
| small_12 | uint96  | 0    | 20     | 12    | src/Storage.sol:Storage2 |
|----------+---------+------+--------+-------+--------------------------|
| large_30 | uint240 | 1    | 0      | 30    | src/Storage.sol:Storage2 |
|----------+---------+------+--------+-------+--------------------------|
| tiny_01  | uint8   | 1    | 30     | 1     | src/Storage.sol:Storage2 |
|----------+---------+------+--------+-------+--------------------------|
| flag_01  | bool    | 1    | 31     | 1     | src/Storage.sol:Storage2 |
╰----------+---------+------+--------+-------+--------------------------╯

Awesome, now we’ve neatly filled each slot with 32 bytes. As a result, we’ve saved ourselves a whole slot! Why is this good? Well, it’s going to save us gas. Let’s see how much.
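The packing rule the compiler follows can be sketched in a few lines. The helper below is a hypothetical illustration (the contract name SlotCounter and its functions are not part of this post’s Storage.sol), and it assumes only simple value types, since structs, arrays and mappings follow extra rules. It walks the field sizes in declaration order and starts a new slot whenever the next field doesn’t fit:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helper, for illustration only.
contract SlotCounter {
    /// Counts the slots needed for five value-type fields of the given
    /// byte sizes, laid out in declaration order.
    function slotsUsed(uint256[5] memory sizes) public pure returns (uint256 slots) {
        uint256 free = 0; // bytes still free in the current slot
        for (uint256 i = 0; i < sizes.length; i++) {
            require(sizes[i] >= 1 && sizes[i] <= 32, "size must be 1..32");
            if (sizes[i] > free) {
                slots += 1; // field does not fit, so start a new slot
                free = 32;
            }
            free -= sizes[i];
        }
    }

    /// Field order of Storage1: address, uint240, uint96, uint8, bool.
    function storage1Slots() external pure returns (uint256) {
        return slotsUsed([uint256(20), 30, 12, 1, 1]);
    }

    /// Field order of Storage2: address, uint96, uint240, uint8, bool.
    function storage2Slots() external pure returns (uint256) {
        return slotsUsed([uint256(20), 12, 30, 1, 1]);
    }
}
```

Under those assumptions, storage1Slots() returns 3 and storage2Slots() returns 2, matching the two forge inspect tables above.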

Measuring Gas Usage for Storage

To measure gas usage for storage, we’re going to have to enhance our contracts a bit, because at present they’ve got a bunch of fields with the default internal visibility and no way to update them. So we can’t read or write to storage the way the contracts are just now. Let’s fix that:

contract Storage1 {
    address public owner_20;
    uint240 public large_30;
    uint96 public small_12;
    uint8 public tiny_01;
    bool public flag_01;

    function setAllFields() external {
        owner_20 = msg.sender;
        large_30 = 1;
        small_12 = 2;
        tiny_01 = 3;
        flag_01 = true;
    }
}

contract Storage2 {
    address public owner_20;
    uint96 public small_12;
    uint240 public large_30;
    uint8 public tiny_01;
    bool public flag_01;

    function setAllFields() external {
        owner_20 = msg.sender;
        large_30 = 1;
        small_12 = 2;
        tiny_01 = 3;
        flag_01 = true;
    }
}

You can see from above we’ve made the fields public so that getter functions are automatically generated. We’ve also added a function to set all the fields to a non-zero value. We’re going to measure the gas usage of calling the function that writes all 5 fields at once. We’re going to do this in three different ways and compare the results.
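Now that we have a setter, we can also check the packed layout by reading a raw slot. The sketch below is not part of the original test suite: it assumes the contracts live in src/Storage.sol (as the forge inspect output suggests) and that the test sits in a sibling test directory, and the test contract name is made up. It uses the forge-std cheatcode vm.load to read slot 0 of Storage2, then shifts and truncates to pull out the two fields at the offsets reported by forge inspect:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";
import {Storage2} from "../src/Storage.sol"; // path is an assumption

/// Hypothetical test, for illustration only.
contract Storage2RawSlotTest is Test {
    function testSlot0HoldsOwnerAndSmall() public {
        Storage2 s = new Storage2();
        s.setAllFields();

        // Read the whole 32-byte word stored in slot 0.
        uint256 slot0 = uint256(vm.load(address(s), bytes32(uint256(0))));

        // owner_20 sits at offset 0, so it is the lowest 20 bytes.
        address owner = address(uint160(slot0));
        // small_12 sits at offset 20, so shift right by 20 * 8 = 160 bits.
        uint96 small = uint96(slot0 >> 160);

        // The hand-decoded values should agree with the generated getters.
        assertEq(owner, s.owner_20());
        assertEq(small, s.small_12());
    }
}
```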

Method 1 - Forge Test Gas Report

OK, let’s throw in a simple test and then we’ll run it with gas reporting enabled:

contract Storage1And2Test is Test {
    Storage1 public storageContract1;
    Storage2 public storageContract2;

    function setUp() public {
        storageContract1 = new Storage1();
        storageContract2 = new Storage2();
    }

    function testStorage1SetAllFields() public {
        storageContract1.setAllFields();
    }

    function testStorage2SetAllFields() public {
        storageContract2.setAllFields();
    }
}

And then the test output with gas reporting looks like this:

$ forge test --mt Storage1 --gas-report
...
╭--------------------------+-----------------+-------+--------+-------+---------╮
| src/Storage.sol:Storage1 |                 |       |        |       |         |
+===============================================================================+
| Deployment Cost          | Deployment Size |       |        |       |         |
|--------------------------+-----------------+-------+--------+-------+---------|
| 147163                   | 461             |       |        |       |         |
|--------------------------+-----------------+-------+--------+-------+---------|
| Function Name            | Min             | Avg   | Median | Max   | # Calls |
|--------------------------+-----------------+-------+--------+-------+---------|
| setAllFields             | 87681           | 87681 | 87681  | 87681 | 1       |
╰--------------------------+-----------------+-------+--------+-------+---------╯

The above shows us that Storage1 has a deployment size of 461 bytes and that the function setAllFields() used 87681 gas.

Method 2 - Use EVM Codes Site

The awesome EVM Codes site easily lets us see down to the opcode level and see total gas for a function call. First let’s get the bytecode for our first contract:

$ forge inspect Storage1 deployedBytecode
0x6080806040526004361015610012575f80fd5b5f3560e01c908163261cec961461015b5750806326e1ecf214610138578063ac56bff114610113578063b2b9b1ea146100eb578063c9544acc1461008d5763e3bdb95a1461005e575f80fd5b34610089575f3660031901126100895760206bffffffffffffffffffffffff60025416604051908152f35b5f80fd5b34610089575f366003190112610089575f8054336001600160a01b0319909116179055600180546001600160f01b03191681179055600280546dffffffffffffffffffffffffffff19166d0103000000000000000000000002179055005b34610089575f366003190112610089576001546040516001600160f01b039091168152602090f35b34610089575f36600319011261008957602060ff60025460681c166040519015158152f35b34610089575f36600319011261008957602060ff60025460601c16604051908152f35b34610089575f366003190112610089575f546001600160a01b03168152602090f3fea264697066735822122051e1c6c214049c5fc2b44b30f22705bfc9c1bc7aba1ead8adc9860b50300e35c64736f6c634300081e0033

We also need the data for the function call:

$ cast calldata "setAllFields()" ""
0xc9544acc

Armed with the bytecode and calldata, we go to EVM Codes playground and paste both of these items in and click “run”. We then get the opcode list for the Storage1 contract:

EVM Codes Playground: Opcodes for Storage1 Contract

We clicked “run”, but notice how the site doesn’t execute the whole transaction. It just brings up the opcode list and allows us to then step through the opcodes one at a time. To complete execution of the transaction and see the gas consumed we have to click “continue execution”:

EVM Codes Playground: Continue Execution

Finally, we can see the gas used is 87681.

While we’re here, we can dig into the opcodes a bit further. If you select all the opcodes and paste them into a text editor, and then count how many SSTORE opcodes there are, how many do you get? You should get 3. That’s because we write to 3 storage slots. Each slot written costs one SSTORE operation.
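If counting opcodes by hand feels error-prone, Foundry can record storage accesses for us. This is a rough sketch, assuming the contracts live in src/Storage.sol (as the forge inspect output suggests) and using the forge-std cheatcodes vm.record and vm.accesses; the test name is made up for illustration. Note that it counts the writes actually executed during the call rather than SSTORE opcodes in the bytecode, although for a straight-line function like ours the two should agree:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test, console} from "forge-std/Test.sol";
import {Storage1} from "../src/Storage.sol"; // path is an assumption

/// Hypothetical test, for illustration only.
contract SstoreCountTest is Test {
    function testCountStorage1Writes() public {
        Storage1 s = new Storage1();

        // Start recording all storage reads and writes.
        vm.record();
        s.setAllFields();

        // Fetch the slots written on the Storage1 contract since vm.record().
        (, bytes32[] memory writes) = vm.accesses(address(s));

        // One entry is recorded per executed storage write.
        console.log("Storage1 storage writes:", writes.length);
    }
}
```

Run it with forge test --mt testCountStorage1Writes -vv to see the logged count.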

Bonus Method: Note that we could get a similar result of seeing the opcodes using the decompiler from the dedaub site, just as we did in this earlier post.

Method 3 - Forge Test With Verbose Logs

We can also see gas consumed from the most verbose logs of a forge test run.

$ forge test --mt Storage1 -vvvvv
...
Traces:
...
  [71917] Storage1And2Test::testStorage1SetAllFields()
    ├─ [66617] Storage1::setAllFields()
    │   ├─  storage changes:
    │   │   @ 2: 0 → 0x0000000000000000000000000000000000000103000000000000000000000002
    │   │   @ 0: 0 → 0x0000000000000000000000007fa9385be102ac3eac297483dd6233d62b3e1496
    │   │   @ 1: 0 → 1
    │   └─ ← [Stop]
    └─ ← [Stop]

We can see from above that the function setAllFields() cost 66617 gas, and we can see under the section storage changes: that there were three storage slots written to (numbered @ 2, @ 0, @ 1 in the log).

Comparing Measurements

We can repeat the above measurement methods 1 to 3 for the contract Storage2, and collate the results like so:

| Measurement                            | Storage1 | Storage2 |
|----------------------------------------+----------+----------|
| Deploy Bytecode Size (Bytes)           | 461      | 391      |
| Gas cost for setAllFields() [Method 1] | 87681    | 65515    |
| Gas cost for setAllFields() [Method 2] | 87681    | 65515    |
| Gas cost for setAllFields() [Method 3] | 66617    | 44451    |
| Count of SSTOREs in Opcodes            | 3        | 2        |

We can see that the Storage2 contract was 22166 gas cheaper (87681 versus 65515), all because we saved writing to a storage slot. That is close to the 22100 gas we quoted earlier for a first non-zero write. It also meant a smaller contract deploy size, since we needed fewer opcodes (one less SSTORE).

The figures for method 3 do stand out, though. Why do the gas usage numbers for forge test ... -vvvvv (method 3) disagree with the other methods? The answer is that methods 1 and 2 include the overhead costs of a full transaction, as if it were being called from an external client. Method 3 gives the cost only of the function call, without the transaction “wrapper”. The overhead of the transaction is a flat 21000 gas plus 64 gas for the function selector (16 gas per non-zero calldata byte * 4 bytes), which matches the 21064 gas gap we see for both contracts. It would be higher if we’d passed more calldata.
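That overhead calculation can be written down as a small helper. This is a simplified sketch, not a full implementation of the protocol’s rules: the contract and function names are hypothetical, and it only covers a plain call with no access list, charging the flat 21000 gas plus 16 gas per non-zero calldata byte and 4 gas per zero byte:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helper, for illustration only.
contract TxOverhead {
    /// Estimates the up-front gas of a plain call carrying `data`.
    function intrinsicGas(bytes memory data) public pure returns (uint256 total) {
        total = 21000; // flat cost of any transaction
        for (uint256 i = 0; i < data.length; i++) {
            // Zero bytes are cheaper than non-zero bytes.
            total += data[i] == bytes1(0) ? 4 : 16;
        }
    }

    /// Overhead for calling setAllFields(), whose calldata is just the selector.
    function setAllFieldsOverhead() external pure returns (uint256) {
        return intrinsicGas(abi.encodeWithSignature("setAllFields()"));
    }
}
```

For our 4-byte selector 0xc9544acc every byte is non-zero, so setAllFieldsOverhead() returns 21000 + 4 * 16 = 21064, which is exactly the gap between method 3 and the other two methods.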

Packing A Spare Field

Now let’s see what happens when we need to pack in one more field. Let’s say we have another uint8 that needs to be added. Since our storage is already perfectly packed into 32-byte words, we will just plonk it on the end:

contract Storage3 {
    address owner_20;
    uint96 small_12;
    uint240 large_30;
    uint8 tiny_01;
    bool flag_01;
    uint8 public anotherTiny_01;  // <-- the new field
}

We’ve also removed the other getters, and removed the setter function to simplify. Let’s compare this contract’s deploy size and gas usage with another where we make the new field a uint256 (even though our business requirements only need a uint8):

contract Storage4 {
    address owner_20;
    uint96 small_12;
    uint240 large_30;
    uint8 tiny_01;
    bool flag_01;
    uint256 public anotherTiny_01;  // <-- the new field, made larger
}

Measuring Gas Usage

Now let’s add a super simple test to just read the field in each case, and see which getter function for the new field anotherTiny_01 consumes the least gas.

contract Storage3And4Test is Test {
    Storage3 public storageContract3;
    Storage4 public storageContract4;

    function setUp() public {
        storageContract3 = new Storage3();
        storageContract4 = new Storage4();
    }

    function testStorage3And4() public view {
        assertEq(storageContract3.anotherTiny_01(), 0);
        assertEq(storageContract4.anotherTiny_01(), 0);
    }
}

$ forge test --mt Storage3And4 --gas-report
...
╭---------------------------+-----------------+------+--------+------+---------╮
| src/Storage.sol:Storage3  |                 |      |        |      |         |
+==============================================================================+
| Deployment Cost           | Deployment Size |      |        |      |         |
|---------------------------+-----------------+------+--------+------+---------|
| 79371                     | 144             |      |        |      |         |
|---------------------------+-----------------+------+--------+------+---------|
| Function Name             | Min             | Avg  | Median | Max  | # Calls |
|---------------------------+-----------------+------+--------+------+---------|
| anotherTiny_01            | 2248            | 2248 | 2248   | 2248 | 1       |
╰---------------------------+-----------------+------+--------+------+---------╯

╭---------------------------+-----------------+------+--------+------+---------╮
| src/Storage.sol:Storage4  |                 |      |        |      |         |
+==============================================================================+
| Deployment Cost           | Deployment Size |      |        |      |         |
|---------------------------+-----------------+------+--------+------+---------|
| 78723                     | 141             |      |        |      |         |
|---------------------------+-----------------+------+--------+------+---------|
| Function Name             | Min             | Avg  | Median | Max  | # Calls |
|---------------------------+-----------------+------+--------+------+---------|
| anotherTiny_01            | 2242            | 2242 | 2242   | 2242 | 1       |
╰---------------------------+-----------------+------+--------+------+---------╯

The above shows the surprising result: not only does the uint256 storage variable give us a smaller contract deployment size (141 versus 144 bytes), it also consumes less gas when we read it (2242 versus 2248 gas)!

How can this be? How can making a field larger be more efficient? To answer that, we need to look into what opcodes are being used.

Opcode Analysis

We get the deployed bytecode for contract Storage3 like this:

$ forge inspect Storage3 deployedBytecode
0x60808060405260043610156011575f80fd5b5f3560e01c63801147b7146023575f80fd5b34603e575f366003190112603e5760209060ff600254168152f35b5f80fdfea2646970667358221220450ff733407af87c2e1ce4d218c1c4712e7e982f7db6753aeb65e5d71f90a50664736f6c634300081e0033

We get the information we need for the calldata field like this:

$ cast calldata "anotherTiny_01()" ""
0x801147b7

And again we can use the EVM Codes site and paste in the deployed bytecode and the calldata of 0x801147b7 to see the opcodes. These are the same steps we took in the Method 2 section earlier to look at gas usage. We then repeat the above steps for contract Storage4 and compare how the opcodes look.

The reason for the difference lies near the end of the contract opcode list. To orientate ourselves in the code, we’re looking for an SLOAD opcode (which is us reading a value from storage) near to a RETURN opcode (which is the function returning the value). Let’s see both listings, with annotations, then I’ll explain more:

Opcode List Extract for Storage3

[32] PUSH1 20   <-- Push the return length (0x20 = 32 bytes), used by RETURN later
[34] SWAP1      <-- Arrange stack so the memory offset is on top
[35] PUSH1 ff   <-- [EXTRA] Prepare a mask of 1 byte
[37] PUSH1 02   <-- Push 2 to stack (representing storage slot #2)
[39] SLOAD      <-- Load storage slot #2 (the field anotherTiny_01)
[3a] AND        <-- [EXTRA] Combine mask with SLOADed data, keeping only the lowest byte
[3b] DUP2       <-- Duplicate the memory offset because MSTORE will consume it
[3c] MSTORE     <-- Copy to memory
[3d] RETURN     <-- Return memory as function return value

Opcode List Extract for Storage4

[32] PUSH1 20   <-- Push the return length (0x20 = 32 bytes), used by RETURN later
[34] SWAP1      <-- Arrange stack so the memory offset is on top
[35] PUSH1 02   <-- Push 2 (representing storage slot #2) to stack
[37] SLOAD      <-- Load storage slot #2 (the field anotherTiny_01)
[38] DUP2       <-- Duplicate the memory offset because MSTORE will consume it
[39] MSTORE     <-- Copy to memory
[3a] RETURN     <-- Return memory as function return value

What the above shows is that the Storage3 contract has two extra opcodes, annotated with [EXTRA]. Remember that slots are read in whole 32-byte words, even if you touch only a small part of that word. What these additional opcodes are doing is masking the 32 bytes returned by the SLOAD such that only the 1 byte of our field anotherTiny_01 is allowed to “shine through”. Since the ABI returns a 32-byte word anyway, chopping out just the 1 byte we use costs more gas than we need to spend.
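To see the mask in isolation, here is a minimal sketch in inline assembly. MaskDemo and its fields are hypothetical and not part of Storage.sol. Two uint8 fields share slot 0, and the two functions read the slot with and without the AND that the compiler adds for narrow types:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical demo contract, for illustration only.
contract MaskDemo {
    uint8 tiny = 7;      // slot 0, offset 0 (lowest byte)
    uint8 neighbour = 1; // slot 0, offset 1 (next byte up)

    /// Mirrors the Storage3 getter: SLOAD, then AND with 0xff.
    function readMasked() external view returns (uint256 value) {
        assembly {
            value := and(sload(tiny.slot), 0xff)
        }
    }

    /// Mirrors the Storage4 getter: SLOAD only, no mask.
    function readWhole() external view returns (uint256 value) {
        assembly {
            value := sload(tiny.slot)
        }
    }
}
```

Here readMasked() returns 7, while readWhole() returns 263 (0x0107) because the neighbouring byte leaks into the result. A uint256 fills the whole slot, so there is nothing to mask away.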

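In contract Storage4 we’re saving two opcode operations, because we just read out and return the whole 32-byte word from storage. PUSH1 ff takes 2 bytes and AND takes 1 byte, which accounts for the 3-byte difference in deployment size, and each of them costs 3 gas to execute, which accounts for the 6 gas difference. This is why we save deployed bytes and gas!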

There is a small downside, though. If we make our field uint256 instead of uint8 we’re losing some degree of type safety, because perhaps we’d like the call to revert if our value gets too big for a uint8, as Solidity’s checked arithmetic would do for us.
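If you want the smaller getter but still need the range guarantee, one option is to check the bound yourself wherever the field is written. This is only a sketch: Storage4Checked, setAnotherTiny and the ValueTooLarge error are hypothetical names, and the extra check has its own gas and bytecode cost on the write path, so you’d want to measure whether it is worth it:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical variant of Storage4, for illustration only.
contract Storage4Checked {
    uint256 public anotherTiny_01; // stored as a full word, read without a mask

    error ValueTooLarge(uint256 value);

    /// Rejects anything that would not have fitted in a uint8.
    function setAnotherTiny(uint256 value) external {
        if (value > type(uint8).max) revert ValueTooLarge(value);
        anotherTiny_01 = value;
    }
}
```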

Nevertheless, it’s an interesting micro-optimisation and sheds some light on EVM internals and the tools used to examine it.

Gas Micro-Optimisation using Storage

Kevin Small