Using aws s3api list-objects

aws-vault exec account -- aws s3api list-objects --bucket "bucket" --query "Contents[].{Key: Key}" --prefix "test" > test.json

It is also confusing how many objects the command will return.

From this Stack Overflow answer: https://stackoverflow.com/questions/39054121/how-many-objects-are-returned-by-aws-s3api-list-objects

Returns some or all (up to 1000) of the objects in a bucket. You can use the request parameters as selection criteria to return a subset of the objects in a bucket. [1]

I think that the part “(up to 1000)” in the documentation’s description is highly misleading. It refers to the maximum page size per underlying HTTP request sent by the CLI. The documentation of the --page-size option makes this clear:

The size of each page to get in the AWS service call. This does not affect the number of items returned in the command’s output. Setting a smaller page size results in more calls to the AWS service, retrieving fewer items in each call. This can help prevent the AWS service calls from timing out.

It gets even clearer when reading the AWS documentation about pagination [2] which describes:

For commands that can return a large list of items, the AWS Command Line Interface (AWS CLI) adds three options that you can use to control the number of items included in the output when the AWS CLI calls a service’s API to populate the list.

By default, the AWS CLI uses a page size of 1000 and retrieves all available items. For example, if you run aws s3api list-objects on an Amazon S3 bucket that contains 3,500 objects, the CLI makes four calls to Amazon S3, handling the service-specific pagination logic for you in the background and returning all 3,500 objects in the final output.
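The four calls in that example follow from ceiling division of the object count by the page size. A minimal sketch of the arithmetic, where pages_needed is a hypothetical helper name (not part of the AWS CLI) and the listing is assumed to be non-empty:

```shell
#!/bin/sh
# pages_needed TOTAL PAGE_SIZE
# Hypothetical helper (not an AWS CLI feature): number of service calls
# needed to list TOTAL objects when each call returns at most PAGE_SIZE
# objects. Assumes TOTAL is at least 1; uses integer ceiling division.
pages_needed() {
    total=$1
    page_size=$2
    echo $(( (total + page_size - 1) / page_size ))
}

# The documentation's example: 3,500 objects at the default page size.
pages_needed 3500 1000
```

Lowering --page-size raises the number of calls but, per the documentation quoted above, does not change how many items end up in the output.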

As Ankit already stated correctly, using the --max-items option is the correct solution to limit the result and stop the automatic pagination:

To include fewer items at a time in the AWS CLI output, use the --max-items option. The AWS CLI still handles pagination with the service as described above, but prints out only the number of items at a time that you specify. [2]
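As a rough sketch of how --max-items might be applied to the command at the top of this post: the wrapper name list_first_n and its argument order are made up for illustration, and the bucket and prefix values are placeholders. Only the aws s3api list-objects options themselves come from the CLI.

```shell
#!/bin/sh
# list_first_n BUCKET PREFIX N
# Hypothetical wrapper: ask the CLI to print at most N object keys
# under PREFIX. --max-items caps the CLI output; the page size used
# for the underlying service calls is left at its default.
list_first_n() {
    aws s3api list-objects \
        --bucket "$1" \
        --prefix "$2" \
        --max-items "$3" \
        --query "Contents[].{Key: Key}"
}

# Example call (needs valid AWS credentials and a real bucket name):
# list_first_n "bucket" "test" 100 > test.json
```

The pagination guide [2] also describes a NextToken value that appears in the output when --max-items truncates the result and that can be passed to --starting-token to continue the listing. A --query that selects only Contents, as here, filters that token out of the output, so drop or widen the query if you need to resume.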

References

[1] https://docs.aws.amazon.com/cli/latest/reference/s3api/list-objects.html
[2] https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-pagination.html
