Document-Level Queries

While RSQL is most often used for segment queries, it also works on document and perspective instances. This page explains how to use RSQL to query documents directly. The perspective endpoint behaves similarly to the document endpoint.

Field Access and Dot Notation

The query language exposes fields on the top-level object and subfields via dot notation. This is the same pattern used in segment queries (e.g., perspective.name), but applied to document attributes.

Query Endpoints

Document-level queries can be used with the documents endpoint:

https://api.redshred.com/v2/collections/{collection_name}/documents/

Query Styles

Basic Document Filtering

Querying Documents by File Size

file_size <= 102000

This query finds all documents whose file size is at most 102,000 bytes (102 KB). It shows how RSQL inherits functionality from Django’s Q objects to support numeric comparisons on document attributes.

You can use this query in two ways:

As a direct query parameter to the documents endpoint:

/v2/collections/research-reading/documents/?q=file_size%20%3C=%20102000&fields=self_link,name,file_size

As a field-specific filter parameter (Django-style):

/v2/collections/research-reading/documents/?file_size__lte=102000&fields=self_link,name,file_size

Both approaches achieve the same result, with the first using RSQL syntax and the second using Django’s field lookup syntax.
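The two request styles above can be built programmatically. The following is a minimal sketch using only Python's standard library; the helper names `rsql_url` and `lookup_url` are hypothetical, and the base URL simply combines the documents endpoint shown earlier with the `research-reading` collection used in the examples. No request is sent; the code only constructs the URLs.

```python
from urllib.parse import urlencode, quote

# Assumption: production host from the endpoint above, example collection from this page.
BASE = "https://api.redshred.com/v2/collections/research-reading/documents/"


def rsql_url(query: str, fields: list[str]) -> str:
    """Build a documents URL that passes an RSQL expression in the `q` parameter."""
    params = {"q": query, "fields": ",".join(fields)}
    # quote_via=quote encodes spaces as %20; safe="," keeps the field list readable.
    return BASE + "?" + urlencode(params, quote_via=quote, safe=",")


def lookup_url(lookups: dict, fields: list[str]) -> str:
    """Build a documents URL that uses Django-style field lookups as parameters."""
    params = {**lookups, "fields": ",".join(fields)}
    return BASE + "?" + urlencode(params, quote_via=quote, safe=",")


fields = ["self_link", "name", "file_size"]
print(rsql_url("file_size <= 102000", fields))
print(lookup_url({"file_size__lte": 102000}, fields))
```

Note that this sketch also percent-encodes the `=` inside the RSQL expression (as `%3D`), which decodes to the same query string as the hand-written URL above.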

Document Name Filtering

name ~ /report/i

This finds documents with “report” in their name (case insensitive).
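To make the matching behavior concrete, here is a local illustration in Python. It assumes the RSQL pattern behaves like an unanchored, case-insensitive regular expression search, which is what the description above suggests; the document names are invented for the example and the code does not call the API.

```python
import re

# Hypothetical document names, used only for illustration.
names = ["Quarterly_Report.pdf", "annual-REPORT-2022.docx", "meeting_notes.txt"]


def matches_report(name: str) -> bool:
    """Return True if "report" occurs anywhere in the name, ignoring case."""
    # re.search is unanchored, so the pattern may match at any position.
    return re.search(r"report", name, flags=re.IGNORECASE) is not None


matching = [n for n in names if matches_report(n)]
print(matching)
```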

Date Filtering

created_at >= "2023-01-01"

This finds documents created on or after January 1, 2023.
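If the cutoff date comes from application code, the filter string can be generated rather than typed by hand. The sketch below assumes the date literal is written as a quoted ISO 8601 date, as in the example above; `created_since` is a hypothetical helper name.

```python
from datetime import date


def created_since(day: date) -> str:
    """Return an RSQL filter for documents created on or after `day`."""
    # date.isoformat() yields YYYY-MM-DD, the format used in the example above.
    return f'created_at >= "{day.isoformat()}"'


print(created_since(date(2023, 1, 1)))
```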

Combining Conditions

file_size <= 102000 and name ~ /report/i

This query finds documents that are both no larger than 102,000 bytes (102 KB) and have “report” in their name.
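Combined conditions can be assembled from individual filters and then percent-encoded for use in the `q` parameter. This is a minimal sketch; `all_of` is a hypothetical helper, and it only covers the `and` operator shown above.

```python
from urllib.parse import quote, unquote


def all_of(*conditions: str) -> str:
    """Join RSQL conditions so that every one of them must hold."""
    return " and ".join(conditions)


query = all_of("file_size <= 102000", "name ~ /report/i")
# safe="" also encodes "/" so the regex delimiters survive inside a query string.
encoded = quote(query, safe="")

print(query)
print(encoded)
```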

Common Document Attributes

Documents have several attributes that can be queried:

Attribute	Description	Example Query
`name`	Document filename	`name = "quarterly_report.pdf"`
`file_size`	Size in bytes	`file_size <= 1024000`
`created_at`	Creation timestamp	`created_at >= "2023-01-01"`
`updated_at`	Last update timestamp	`updated_at >= "2023-01-01"`
`mime_type`	MIME type	`mime_type = "application/pdf"`
`status`	Processing status	`status = "complete"`
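The example queries in the table follow a simple pattern: string values are double-quoted and numeric values are written bare. The sketch below builds conditions in that style. The helper name `condition` is hypothetical, and because this page does not describe how to escape a double quote inside a string value, the helper rejects such values rather than guessing.

```python
def condition(field: str, op: str, value) -> str:
    """Format a single RSQL condition in the style of the table above."""
    if isinstance(value, bool):
        # Boolean literals are not covered on this page, so do not guess a syntax.
        raise TypeError("boolean values are not handled by this sketch")
    if isinstance(value, (int, float)):
        literal = str(value)  # numbers are written without quotes
    elif isinstance(value, str):
        if '"' in value:
            raise ValueError("escaping of double quotes is not documented here")
        literal = f'"{value}"'  # strings are wrapped in double quotes
    else:
        raise TypeError(f"unsupported value type: {type(value).__name__}")
    return f"{field} {op} {literal}"


print(condition("mime_type", "=", "application/pdf"))
print(condition("file_size", "<=", 1024000))
```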

API Response Fields

When querying documents, you can specify which fields to include in the response using the fields parameter:

https://api.redshred.com/v2/collections/research-reading/documents/?q=file_size%20%3C=%20102000&fields=self_link,name,file_size,created_at

This returns only the specified fields for each matching document, which can improve performance for large result sets.
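To see what the query string in that URL actually carries, it can be decoded with Python's standard library. This sketch only parses the string locally; the variable names are illustrative.

```python
from urllib.parse import parse_qs

# Query string copied from the example URL above.
query_string = "q=file_size%20%3C=%20102000&fields=self_link,name,file_size,created_at"

params = parse_qs(query_string)
rsql = params["q"][0]                     # the decoded RSQL expression
fields = params["fields"][0].split(",")   # the requested response fields

print(rsql)
print(fields)
```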

Performance Considerations

For optimal performance when querying documents:

Be as specific as possible with your queries
Use the fields parameter to limit the returned data
Consider using pagination parameters (limit and offset) for large result sets
For complex queries, the Django-style field lookups may offer better performance in some cases
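The pagination advice above can be sketched as a generator of page URLs. The `limit` and `offset` parameter names come from the list above; the helper name `page_urls`, the page size, and the fixed page count are assumptions for illustration. A real client would normally stop when a page comes back empty or when the response signals that no further results exist, which this page does not specify.

```python
from typing import Iterator
from urllib.parse import urlencode, quote

# Assumption: production host and example collection used elsewhere on this page.
BASE = "https://api.redshred.com/v2/collections/research-reading/documents/"


def page_urls(query: str, fields: list[str], limit: int, max_pages: int) -> Iterator[str]:
    """Yield one documents URL per page, advancing `offset` by `limit` each time."""
    for page in range(max_pages):
        params = {
            "q": query,
            "fields": ",".join(fields),  # keep responses small
            "limit": limit,
            "offset": page * limit,
        }
        yield BASE + "?" + urlencode(params, quote_via=quote, safe=",")


for url in page_urls("file_size <= 102000", ["self_link", "name"], limit=50, max_pages=3):
    print(url)
```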