Pull Request Checklist
When opening pull requests for Unkey, there are some things we have to be careful about in terms of security and speed.
SQL Injections - ClickHouse
When making changes to ClickHouse queries, we must be careful about passing values from tRPC or any other place. The way ClickHouse is utilized in Unkey doesn't let you pass values without sanitization first through zod schemas, but in some cases, you can bypass that and inject custom values into the query. By default, the ClickHouse client forces you to pass parameters with types to sanitize them, but in the end, it's still SQL, so if you slip, malicious people can still attack you. SQL injection can happen not just through direct string interpolation but also through dynamic SQL construction in general. To prevent this:
- Always use ClickHouse's parameterized queries with typed parameters
- Validate input with strict zod schemas before query construction
- Avoid dynamic SQL generation where possible
Example:
The issue is in the dynamic path conditions where values are directly interpolated into the SQL string without proper escaping:
An attacker could input a value like: ' OR '1'='1
which would create:
To fix this:
- Use ClickHouse's parameter binding:
This ensures all values are properly escaped by ClickHouse's query engine.
SQL Performance - ClickHouse
When querying ClickHouse, we have to be careful about what we are joining or querying since that particular data might be huge. For example, not handling queries to metrics.raw_api_requests_v1 properly can be quite problematic.
To improve query performance:
- Filter data as early as possible in the query
- Apply indexes on frequently filtered columns
Example:
Imagine this query: we are trying to get ratelimit details by joining with raw_api_requests_v1. Since raw_api_requests_v1 might be huge, the query can take a long time to execute or consume too much memory. To optimize this, we should first filter as much as possible, with time and workspace_id being the most important factors. We can make the query much faster by eliminating unrelated workspaceId's and irrelevant time ranges.
Finally, don't forget about adding indexes for our filters. While this step is optional, it can greatly improve query performance:
Leaking Data From Client to Server
When handling data access in tRPC, we must validate that users can only access data they're authorized to see. Failing to check permissions can lead to data leaks.
Example:
The unsafe version lets any authenticated user view logs from any workspace by passing different workspaceIds. Always verify resource ownership through the server context instead of trusting client input.
Pull Request Security Checklist
- SQL queries use parameterized values, not string interpolation
- Input is validated through zod schemas
- Database queries filter data early for performance
- Indexes are added for frequently filtered columns
- Resource access is verified through server context
- Client-provided IDs are never trusted for data access
Last updated on