What is Input Malleability?

When you call a Solidity function with a dynamic argument such as an array or bytes, the ABI encoding does not store that argument in place. Instead, it stores an offset that points to a length word, which is followed by the data. Because the offset can point to more than one valid location, multiple distinct calldata layouts decode to the same input values. A function that only uses its decoded parameters will execute identically for all of them, even though the raw bytes differ. This subtlety of the Solidity ABI is known as input malleability, and it has consequences for logic that parses calldata manually. This article explains in detail how this works, and the potential consequences of forgetting about it.

We will look at some examples of calldata, decode them with tools and then manually, and get a good understanding of what is going on “under the hood”.

Example Calldata

Here is an example of two distinct calldata strings that decode to the same function call. It is not possible to decode calldata reliably without knowing the function signature. To get the function signature we need the source code or the ABI, or we can rely on a service like 4byte.directory. Below we’re going to use Foundry’s cast 4byte-decode.

Calldata A

Consider the calldata below. Let’s see what Solidity function it will execute and with what parameters:

0x1cff79cd0000000000000000000000001240fa2a84dd9157a0e76b5cfe98b1d52268b26400000000000000000000000000000000000000000000000000000000000000400000000000000000000000000000000000000000000000000000000000000064d9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001

We can use cast to decode the above:

$ cast 4byte-decode 0x1cff79cd0000000000000000000000001240fa2a84dd9157a0e76b5cfe98b1d52268b26400000000000000000000000000000000000000000000000000000000000000400000000000000000000000000000000000000000000000000000000000000064d9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001
1) "execute(address,bytes)"
0x1240FA2A84dd9157a0e76B5Cfe98B1d52268B264
0xd9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001

We can see from above that Calldata A will call a function called execute that expects an address and a bytes array. The above shows the address is 0x1240FA..64 and the bytes array is 0xd9caed..01.

Calldata B

Here is another calldata string, different from Calldata A. We will examine the differences later, but for now let’s see what Solidity function it will execute, and with what parameters:

0x1cff79cd0000000000000000000000001240fa2a84dd9157a0e76b5cfe98b1d52268b2640000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000064d9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001

Again we use cast to decode this:

$ cast 4byte-decode 0x1cff79cd0000000000000000000000001240fa2a84dd9157a0e76b5cfe98b1d52268b2640000000000000000000000000000000000000000000000000000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000064d9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001
1) "execute(address,bytes)"
0x1240FA2A84dd9157a0e76B5Cfe98B1d52268B264
0xd9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001

The above shows that, again, we are calling a function execute. Looking more closely, we can see that we’re calling that function with the same parameters and values as we had with Calldata A. The first parameter is an address 0x1240FA..64 and the bytes array is 0xd9caed..01. The bytes array is of length 100 bytes, identical to the bytes array from Calldata A.

So, two different calldata strings, yet the same function call!

Manual Decoding

We’re now going to perform some manual decoding, which will reveal why the two different calldata strings produce the same results. There is a very thorough article that goes into ABI encoding and decoding in more detail than we need for our case. It is worth a read if you’re dealing with more complex function parameters such as nested structs.

Manual Decoding of Calldata A

It is pretty easy to manually decode calldata using a text editor to rearrange the data. We can paste the raw Calldata A into a text editor and as a first pass we can hit enter a few times to chop it into logical pieces:

0x
1cff79cd
0000000000000000000000001240fa2a84dd9157a0e76b5cfe98b1d52268b264
0000000000000000000000000000000000000000000000000000000000000040
0000000000000000000000000000000000000000000000000000000000000064
d9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001

Annotating the above, we get:

Manually Decoding Calldata A: arranging data in text editor

In the above, the blue fields are always present. The calldata always starts with a 4-byte function selector. Most fields after that will be treated as 32-byte words (64 hex characters in our text editor) but not all, as we shall see. In this example the function selector is 1cff79cd, which tells us the function signature is execute(address,bytes). We know this because we saw it in the output of the cast command above, or we can check it with the 4byte.directory service. The yellow fields depend on the function signature, but there is always going to be some arrangement of 32-byte words somewhere.
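The "press enter a few times" step can also be sketched in Python. This is only an illustration: the helper name chop is hypothetical, and Calldata A is reassembled from its individual fields instead of being pasted in as one long string.

```python
def chop(calldata_hex, head_words):
    """Split calldata into (selector, first `head_words` 32-byte words, remaining tail)."""
    raw = calldata_hex[2:] if calldata_hex.startswith("0x") else calldata_hex
    selector, body = raw[:8], raw[8:]  # 4-byte selector = 8 hex characters
    # One 32-byte word = 64 hex characters
    words = [body[i * 64:(i + 1) * 64] for i in range(head_words)]
    tail = body[head_words * 64:]
    return selector, words, tail


# Calldata A, reassembled from its fields (left-padded with zeroes to 32 bytes).
ACTION_DATA_HEX = (
    "d9caed12"
    + "8ad159a275aee56fb2334dbb69036e9c7bacee9b".rjust(64, "0")
    + "44e97af4418b7a17aabd8090bea0a471a366305c".rjust(64, "0")
    + "1".rjust(64, "0")
)
CALLDATA_A = (
    "0x1cff79cd"
    + "1240fa2a84dd9157a0e76b5cfe98b1d52268b264".rjust(64, "0")
    + "40".rjust(64, "0")
    + "64".rjust(64, "0")
    + ACTION_DATA_HEX
)

selector, words, tail = chop(CALLDATA_A, head_words=3)
print(selector)
for w in words:
    print(w)
print(tail)
```

Asking for three head words reproduces the layout shown in the text editor: the selector, three 32-byte words, and the remaining 100 bytes kept together on one line.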

Given that we know the function signature is execute(address,bytes), we can press enter a couple more times to separate the function parameters more clearly:

Manually Decoding Calldata A: interpreting the fields

In the above, we can see the address parameter is very simple: it is padded to 32 bytes and stored in place. The bytes parameter is dynamic and is not stored in place like the address, and this is the heart of the malleability. The bytes parameter is stored in three parts: the offset, the length and the actual data:

  • offset: 32-byte word containing a pointer to where the length field starts. The offset is measured in bytes, counted from the start of the encoded arguments, that is, from the first byte after the 4-byte selector. This “offset start point” is shown as the blue arrow in the above diagram. In this case, the offset value is 0x40, or 64 bytes, which measured from the blue arrow takes us directly to the next field, the length field, with no gaps.
  • length: 32-byte word containing the number of bytes in the bytes array, here 0x64, or 100 bytes. The offset points to this length field. The actual data follows immediately after this field.
  • actual data: the contents of the bytes array, as one contiguous run of bytes.

This result matches what we saw above when we decoded the calldata using cast. The first parameter is an address 0x1240FA..64 and the bytes array is 0xd9caed..01, of length 100 bytes. There are no surprises here; we’re just learning how to manually decode the calldata.
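The three-part layout described above (offset, length, actual data) can be followed programmatically. The Python below is a minimal sketch, not a general ABI decoder: decode_execute and word are hypothetical helper names, it only handles execute(address,bytes), and Calldata A is rebuilt from its fields.

```python
def word(value):
    """Encode an integer as one 32-byte big-endian ABI word."""
    return value.to_bytes(32, "big")


# Calldata A, rebuilt from the fields shown in the text.
SELECTOR = bytes.fromhex("1cff79cd")  # execute(address,bytes)
TARGET = 0x1240FA2A84DD9157A0E76B5CFE98B1D52268B264
ACTION_DATA = (
    bytes.fromhex("d9caed12")
    + word(0x8AD159A275AEE56FB2334DBB69036E9C7BACEE9B)
    + word(0x44E97AF4418B7A17AABD8090BEA0A471A366305C)
    + word(1)
)
CALLDATA_A = SELECTOR + word(TARGET) + word(0x40) + word(len(ACTION_DATA)) + ACTION_DATA


def decode_execute(calldata):
    """Decode execute(address,bytes) by following the offset word."""
    args = calldata[4:]                          # offsets are counted from here
    target = "0x" + args[12:32].hex()            # address: last 20 bytes of word 0
    offset = int.from_bytes(args[32:64], "big")  # word 1: position of the length word
    length = int.from_bytes(args[offset:offset + 32], "big")
    data = args[offset + 32:offset + 32 + length]
    return target, offset, length, data


target, offset, length, data = decode_execute(CALLDATA_A)
print(target, hex(offset), length, data[:4].hex())
```

Note that the decoder never assumes where the length word lives; it always reads the position from the offset word, which is the property the rest of the article relies on.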

Manual Decoding of Calldata B

Now we can perform the same steps again, and see what results we get when we manually decode Calldata B. We paste Calldata B into a text editor and press enter a few times:

0x
1cff79cd
0000000000000000000000001240fa2a84dd9157a0e76b5cfe98b1d52268b264
0000000000000000000000000000000000000000000000000000000000000080
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000064
d9caed120000000000000000000000008ad159a275aee56fb2334dbb69036e9c7bacee9b00000000000000000000000044e97af4418b7a17aabd8090bea0a471a366305c0000000000000000000000000000000000000000000000000000000000000001

Clearly, the above calldata is longer than Calldata A. Annotating the above, we get:

Manually Decoding Calldata B: interpreting the fields

The above diagram shows the address parameter is the same as for Calldata A. However, the bytes parameter parts are subtly different:

  • offset: 32-byte word containing a pointer to where the length field starts. The offset is measured in bytes, counted from the start of the encoded arguments, that is, from the first byte after the 4-byte selector. This “offset start point” is shown as the blue arrow in the above diagram. In this case the offset value is 0x80, or 128 bytes, which measured from the blue arrow takes us to the length field, but with a 64-byte gap in between!
  • length: Identical to Calldata A.
  • actual data: Identical to Calldata A.

The 64 blue-highlighted bytes in the above diagram are just filler. The decoder never reads them in this example, so they can contain any data. They are there because the offset field points “further ahead” than in the Calldata A example, and we need to fill that gap.

As far as the ABI decoder is concerned, Calldata B is equivalent to Calldata A: both decode to the same target address and the same bytes array. The two calldata strings have the same functional intent.
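To see the equivalence concretely, here is a small Python sketch that builds several layouts of the same call and decodes each one. The helpers encode_execute, decode_execute and word are hypothetical, and the encoder mirrors the calldata in this article (the bytes data is appended without trailing padding).

```python
def word(value):
    """Encode an integer as one 32-byte big-endian ABI word."""
    return value.to_bytes(32, "big")


def encode_execute(target, data, gap=b""):
    """Encode execute(address,bytes), placing `gap` filler bytes before the length word."""
    selector = bytes.fromhex("1cff79cd")
    offset = 64 + len(gap)  # two head words (address, offset) plus the filler
    return selector + word(target) + word(offset) + gap + word(len(data)) + data


def decode_execute(calldata):
    """Decode execute(address,bytes) by following the offset word."""
    args = calldata[4:]
    target = int.from_bytes(args[0:32], "big")
    offset = int.from_bytes(args[32:64], "big")
    length = int.from_bytes(args[offset:offset + 32], "big")
    return target, args[offset + 32:offset + 32 + length]


TARGET = 0x1240FA2A84DD9157A0E76B5CFE98B1D52268B264
ACTION_DATA = (
    bytes.fromhex("d9caed12")
    + word(0x8AD159A275AEE56FB2334DBB69036E9C7BACEE9B)
    + word(0x44E97AF4418B7A17AABD8090BEA0A471A366305C)
    + word(1)
)

calldata_a = encode_execute(TARGET, ACTION_DATA)                    # offset 0x40, no gap
calldata_b = encode_execute(TARGET, ACTION_DATA, gap=bytes(64))     # offset 0x80, zero filler
calldata_c = encode_execute(TARGET, ACTION_DATA, gap=b"\xff" * 64)  # offset 0x80, junk filler

# Three different byte strings, one decoded result.
print(len({calldata_a, calldata_b, calldata_c}))
print(decode_execute(calldata_a) == decode_execute(calldata_b) == decode_execute(calldata_c))
```

The third variant is not from the article; it is included only to show that the content of the filler makes no difference to what is decoded.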

Why should I care?

To summarise up to this point: we have two distinct calldata strings, Calldata A and Calldata B, that have the same functional intent, which is to call a function called execute with the same parameters. We’ve verified this with cast and by manual decoding. We’ve learned Calldata B is a bit longer, and has some padding and unused data, but if the calldata produces the same functional effects, who cares?

The answer to that lies with functions that read raw calldata. If the execute() function reads the parameters as normal, there is not an issue:

function execute(address target, bytes calldata actionData) external {
  // If code only reads target and actionData parameters then no problem
  // ...
}

However:

  • if the execute() function reads raw calldata directly…
  • …and the function makes assumptions about the positioning of data within that raw calldata…
  • …then there is a potential problem.

Let’s look at an example execute() function that shows this potential problem:

function execute(address target, bytes calldata actionData) external {
  // Read the 4-bytes selector at the beginning of `actionData`
  bytes4 selector;
  uint256 calldataOffset = 4 + 32 * 3; // Fixed calldata position where `actionData` begins
  assembly {
    selector := calldataload(calldataOffset)
  }
  // ...
}

The above code makes an assumption about where the start of the bytes parameter will be. This is not a safe assumption to make, because as we’ve just learned, we can control where a bytes array will start using the offset field:

  • For Calldata A the above Solidity will produce a selector value of d9caed12.
  • For Calldata B the above Solidity will produce a selector value of 00000000.

The behaviour inside execute() could therefore be different, even though the functional intent of the calldata appears the same.

The diagram below compares Calldata A and Calldata B and shows how the selector field is being read for each:

Manual calldata decoding: interpreting the fields

In the above example the unused filler in Calldata B is all zeroes, but it is easy to see how the selector derivation could be tricked. In Calldata B we could place any 4 bytes we like at byte position 100 of the calldata, and the decoded parameters would be unaffected.
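The difference between the two ways of reading the selector can be modelled in Python. This is a sketch under assumptions: naive_selector imitates the fixed-position calldataload(4 + 32 * 3) from the example function, decoded_selector follows the offset word instead, and the smuggled value deadbeef is an arbitrary placeholder. All helper names are hypothetical.

```python
def word(value):
    """Encode an integer as one 32-byte big-endian ABI word."""
    return value.to_bytes(32, "big")


def encode_execute(target, data, gap=b""):
    """Encode execute(address,bytes), placing `gap` filler bytes before the length word."""
    offset = 64 + len(gap)
    return bytes.fromhex("1cff79cd") + word(target) + word(offset) + gap + word(len(data)) + data


def naive_selector(calldata):
    """Imitate the vulnerable read: 4 bytes at the fixed position 4 + 32 * 3."""
    pos = 4 + 32 * 3
    return calldata[pos:pos + 4].hex()


def decoded_selector(calldata):
    """Follow the offset word to find where actionData really starts."""
    args = calldata[4:]
    offset = int.from_bytes(args[32:64], "big")
    return args[offset + 32:offset + 36].hex()


TARGET = 0x1240FA2A84DD9157A0E76B5CFE98B1D52268B264
ACTION_DATA = (
    bytes.fromhex("d9caed12")
    + word(0x8AD159A275AEE56FB2334DBB69036E9C7BACEE9B)
    + word(0x44E97AF4418B7A17AABD8090BEA0A471A366305C)
    + word(1)
)

calldata_a = encode_execute(TARGET, ACTION_DATA)
calldata_b = encode_execute(TARGET, ACTION_DATA, gap=bytes(64))
# Same layout as Calldata B, but with chosen bytes placed at calldata position 100.
smuggled = encode_execute(
    TARGET, ACTION_DATA, gap=bytes(32) + bytes.fromhex("deadbeef") + bytes(28)
)

for name, cd in [("A", calldata_a), ("B", calldata_b), ("smuggled", smuggled)]:
    print(name, naive_selector(cd), decoded_selector(cd))
```

In all three cases the offset-following read returns d9caed12, while the fixed-position read returns whatever the caller placed at position 100. Any check built on the fixed-position value is therefore checking bytes that the real call never uses.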

This is essentially the solution to Damn Vulnerable DeFi puzzle #15, ABI Smuggling. The full Solidity solution is detailed here.