RudderStack

RudderStack is an open-source customer data pipeline that collects events from websites and apps and routes them to data warehouses and downstream tools. SDK scripts instrument web pages to capture page views, track calls, and identify users, forwarding events to configured destinations. Supports self-hosted and cloud-managed deployments.

Overview

RudderStack is an open-source customer data platform (CDP) that collects events from websites, mobile apps, and server-side sources, then routes them to data warehouses, analytics tools, and marketing platforms. Unlike proprietary CDPs, RudderStack can be fully self-hosted on the customer's own infrastructure, giving teams direct control over data residency and routing. The cloud-managed version (RudderStack Cloud) takes over infrastructure management, but customer data then flows through RudderStack's hosted data plane.

What This Script Does

RudderStack Analytics.js is the browser SDK, loaded from cdn.rudderlabs.com or a customer-configured first-party proxy domain (RudderStack supports proxying the SDK through custom domains to avoid ad blocker interference). The script initializes with a write key and data plane URL pointing to either RudderStack's cloud infrastructure or the customer's self-hosted data plane.
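
As a rough sketch of that initialization step, the helper below checks the two values before handing them to the SDK. It assumes the SDK object exposes load(writeKey, dataPlaneUrl), as the RudderStack JavaScript SDK documents. The RudderLike interface, the initRudder helper, and the placeholder key and URL are illustrative names, not part of the SDK.

```typescript
// Minimal shape of the SDK surface used here (an assumption for illustration).
interface RudderLike {
  load(writeKey: string, dataPlaneUrl: string): void;
}

interface RudderConfig {
  writeKey: string;
  dataPlaneUrl: string; // RudderStack Cloud or a self-hosted data plane
}

// Hypothetical helper: validates the config, then loads the SDK.
function initRudder(sdk: RudderLike, cfg: RudderConfig): void {
  if (cfg.writeKey.trim() === "") {
    throw new Error("writeKey is required");
  }
  // Reject anything that is not an https URL so a bad endpoint fails early.
  if (!/^https:\/\/\S+$/.test(cfg.dataPlaneUrl)) {
    throw new Error("dataPlaneUrl should be an https URL");
  }
  sdk.load(cfg.writeKey, cfg.dataPlaneUrl);
}
```

In a page, this would be called with the global SDK object, for example initRudder(rudderanalytics, { writeKey: "WRITE_KEY", dataPlaneUrl: "https://dataplane.example.com" }).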

Automatic event collection: On load, the SDK fires an automatic page() call capturing the URL, referrer, page title, and UTM parameters. The SDK intercepts browser history API calls (pushState, replaceState) in single-page applications to fire page() on each route change.
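
The route-change behaviour described above can be approximated by wrapping the history methods. This is a generic sketch of the technique, not RudderStack's actual implementation. HistoryLike, watchRouteChanges, and onRoute are hypothetical names, and a plain object can stand in for window.history so the example runs outside a browser.

```typescript
// Subset of the History API this sketch touches.
interface HistoryLike {
  pushState(data: unknown, unused: string, url?: string): void;
  replaceState(data: unknown, unused: string, url?: string): void;
}

// Wraps pushState/replaceState so onRoute runs after each call.
// Returns a function that restores the original methods.
function watchRouteChanges(
  hist: HistoryLike,
  onRoute: (url: string | undefined) => void
): () => void {
  const originalPush = hist.pushState;
  const originalReplace = hist.replaceState;

  hist.pushState = function (data, unused, url) {
    originalPush.call(hist, data, unused, url);
    onRoute(url);
  };
  hist.replaceState = function (data, unused, url) {
    originalReplace.call(hist, data, unused, url);
    onRoute(url);
  };

  return () => {
    hist.pushState = originalPush;
    hist.replaceState = originalReplace;
  };
}
```

In a browser the callback would typically call rudderanalytics.page(). Back and forward navigation does not go through pushState or replaceState, so a popstate listener would be needed to cover it.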

Cookies and local storage for identity:

  • rl_anonymous_id (1-year expiry, first-party cookie or localStorage) — a UUID generated on first visit that persists the anonymous visitor identity across sessions. This is the primary identifier for pre-identification behavioral data.
  • rl_user_id (1-year expiry, first-party) — set by identify() calls, stores the resolved user ID after login or signup
  • rl_trait (1-year expiry, first-party) — stores cached user traits (name, email, plan) passed via identify() for use in subsequent events without re-passing all traits
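
For auditing a page, the keys listed above can be looked for in a cookie string. The sketch below is an assumption-laden helper rather than anything the SDK ships: parseCookies and findRudderCookies are hypothetical names, it checks only for the presence of each key, and it does not inspect localStorage, where the same keys may also be stored.

```typescript
// Storage keys named in the list above.
const RUDDER_KEYS = ["rl_anonymous_id", "rl_user_id", "rl_trait"];

// Parses a document.cookie-style string ("a=1; b=2") into key/value pairs.
function parseCookies(cookieString: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const part of cookieString.split(";")) {
    const index = part.indexOf("=");
    if (index === -1) continue; // skip fragments without a value
    const key = part.slice(0, index).trim();
    if (key !== "") result[key] = part.slice(index + 1).trim();
  }
  return result;
}

// Returns which RudderStack identity keys are present in the cookie string.
function findRudderCookies(cookieString: string): string[] {
  const cookies = parseCookies(cookieString);
  return RUDDER_KEYS.filter((key) =>
    Object.prototype.hasOwnProperty.call(cookies, key)
  );
}
```

In a browser, findRudderCookies(document.cookie) returning a non-empty list before consent has been granted would indicate that the SDK ran too early.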

Track and identify calls: Custom rudderanalytics.track() calls instrument business events. rudderanalytics.identify() links anonymous sessions to known user profiles. Both are transmitted to the configured data plane endpoint (e.g., https://dataplane.example.com/v1/track).
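
To make the difference between the two calls concrete, here is a simplified sketch of the payloads and the endpoint each would be posted to. The field names follow the general shape of RudderStack's event spec, but the real SDK adds further fields (context, message ID, timestamps) and may batch requests. RudderEvent, buildTrack, buildIdentify, and endpointFor are hypothetical names introduced for illustration.

```typescript
type EventType = "track" | "identify";

// Simplified event shape (an assumption; the real payload carries more fields).
interface RudderEvent {
  type: EventType;
  anonymousId: string;
  userId?: string;
  event?: string; // track only
  properties?: Record<string, unknown>; // track only
  traits?: Record<string, unknown>; // identify only
}

// A business event tied to the anonymous visitor.
function buildTrack(
  anonymousId: string,
  eventName: string,
  properties: Record<string, unknown>
): RudderEvent {
  return { type: "track", anonymousId, event: eventName, properties };
}

// Links the anonymous ID to a known user ID and its traits.
function buildIdentify(
  anonymousId: string,
  userId: string,
  traits: Record<string, unknown>
): RudderEvent {
  return { type: "identify", anonymousId, userId, traits };
}

// Builds the data plane endpoint, e.g. https://dataplane.example.com/v1/track.
function endpointFor(dataPlaneUrl: string, ev: RudderEvent): string {
  return dataPlaneUrl.replace(/\/+$/, "") + "/v1/" + ev.type;
}
```

The identify payload is the one that carries directly identifying data (user ID, email in traits), which is why it matters most for the lawful-basis analysis.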

Destination routing: RudderStack supports 200+ downstream destinations. In "device mode," destination SDKs (e.g., Google Analytics 4, Amplitude, Facebook Pixel) are loaded through the RudderStack SDK and execute client-side. In "cloud mode," event data flows server-to-server from the RudderStack data plane to the destination API without additional client-side code. The compliance implications of a device-mode destination are therefore equivalent to loading that destination's script directly.

Warehouse destinations: Cloud-mode routing to data warehouses (Snowflake, BigQuery, Redshift) means raw event data including user identifiers and behavioral history is written to the customer's own data infrastructure. For self-hosted deployments, data never touches RudderStack's cloud.

Consent & Compliance

RudderStack is classified under the analytics consent category, though its effective scope extends to marketing when advertising or marketing automation destinations are configured in device mode.
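
The rule in the paragraph above can be written down as a small function. This is a sketch of that one rule only, under the assumption that each destination has already been labelled by purpose. Destination, requiredConsent, and the field names are hypothetical, and a stricter policy could require marketing consent for cloud-mode marketing destinations as well.

```typescript
type ConsentCategory = "analytics" | "marketing";
type ConnectionMode = "device" | "cloud";

interface Destination {
  label: string;
  mode: ConnectionMode;
  purpose: ConsentCategory; // what the destination is used for
}

// Analytics consent is always required; marketing consent is added when
// any marketing or advertising destination runs in device mode.
function requiredConsent(destinations: Destination[]): ConsentCategory[] {
  const required: ConsentCategory[] = ["analytics"];
  const hasDeviceModeMarketing = destinations.some(
    (d) => d.mode === "device" && d.purpose === "marketing"
  );
  if (hasDeviceModeMarketing) required.push("marketing");
  return required;
}
```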

Under GDPR, RudderStack sets persistent cookies (such as rl_anonymous_id) and processes personal data (behavioral events, user identities) before any downstream destination receives data. The 1-year rl_anonymous_id cookie requires consent under the ePrivacy Directive regardless of which data plane it routes to. Passing personally identifiable data (email, user ID) into the pipeline through identify() is personal data processing and requires a lawful basis.

Self-hosted advantage: When RudderStack is fully self-hosted (data plane, control plane, and destinations on the customer's infrastructure), data does not flow to RudderStack's servers. This eliminates the third-party data sharing concern for the pipeline itself, though the consent requirement for the browser-side cookies and tracking remains.

Under CCPA/CPRA, behavioral data collected via RudderStack and routed to downstream marketing or advertising platforms constitutes personal information. The "sale or share" classification depends on which destinations are connected — routing to advertising platforms qualifies; routing to the customer's own data warehouse does not.

RudderStack Cloud is hosted on AWS. EU region workspace data stays in eu-west-1 when the EU data residency option is used. SCCs are available in RudderStack's DPA for cloud deployments.

Should You Block This Without Consent?

Conditional. For self-hosted RudderStack deployments routing only to first-party analytics infrastructure (e.g., Snowflake + internal dashboards), a legitimate interest basis for the analytics processing may be arguable if cookie lifetimes are minimized and data use is strictly internal. That basis does not cover storing the identifier on the device, however: the 1-year rl_anonymous_id persistent cookie still requires consent under ePrivacy. For cloud-managed deployments or configurations that route to marketing or advertising destinations, block until analytics (and marketing, if applicable) consent is obtained.

Consent Categories

Analytics

Also Known As

RudderStack, Rudder Stack, open source CDP consent, customer data pipeline GDPR, event tracking privacy, data warehouse routing

Industries

Programming and Developer Software, Computers Electronics and Technology

Tracked Domains (2)

cdn.rudderlabs.com (Analytics)
api.rudderlabs.com (Analytics)

Frequently Asked Questions

Does RudderStack require consent?

Yes. RudderStack sets a 1-year rl_anonymous_id cookie on page load that requires consent under ePrivacy. For cloud deployments routing to marketing destinations, marketing consent is also required. Self-hosted deployments routing only to internal infrastructure may support legitimate interest, but the persistent cookie still requires consent.

What cookies does RudderStack set?

RudderStack sets rl_anonymous_id (1-year anonymous UUID), rl_user_id (1-year, set after identify() calls), and rl_trait (1-year, caches user attributes like name, email, and plan). Page views and custom track() events fire to the configured data plane, routing to 200+ possible downstream destinations including analytics and ad platforms.

How does ConsentStack handle RudderStack?

ConsentStack blocks the RudderStack Analytics.js snippet until analytics consent is granted, preventing the anonymous UUID cookie from being set. If your RudderStack configuration routes to marketing destinations, ConsentStack can require both analytics and marketing consent before releasing the script.
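
The general pattern behind this kind of blocking can be sketched as a gate that releases the script once every required category has been granted. This is a generic illustration and not ConsentStack's API. Category, ConsentState, createConsentGate, and loadScript are hypothetical names.

```typescript
type Category = "analytics" | "marketing";
type ConsentState = Record<Category, boolean>;

// Returns a handler to call whenever the visitor's consent changes.
// loadScript runs once, the first time every required category is granted.
function createConsentGate(
  required: Category[],
  loadScript: () => void
): (state: ConsentState) => boolean {
  let loaded = false;
  return function onConsentChange(state: ConsentState): boolean {
    if (!loaded && required.every((c) => state[c])) {
      loaded = true;
      loadScript();
    }
    return loaded; // true once the script has been released
  };
}
```

For a configuration that routes to marketing destinations, the gate would be created with both categories, for example createConsentGate(["analytics", "marketing"], injectRudderSnippet), where injectRudderSnippet is a placeholder for whatever adds the SDK to the page. The sketch does not handle withdrawal of consent, which a real integration would also need to address.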

Related Vendors

Google
Google is the dominant provider of web analytics, advertising, and infrastructure tools. Scripts like Google Analytics, Tag Manager, Ads, and reCAPTCHA collect behavioral data, manage tag firing, serve targeted ads, and detect bots. Sets persistent cookies to track users and correlate activity across sites.
Google Analytics
Google Analytics is the world's most widely deployed web analytics platform. Scripts track page views, sessions, user demographics, traffic sources, and conversion events. Drops cookies to identify returning visitors and attribute user journeys across sessions.
Firebase
Firebase is Google's mobile and web application development platform offering authentication, real-time database, cloud functions, and analytics. Web SDK scripts initialize Firebase services and may track app events via Firebase Analytics, which is powered by Google Analytics 4. Widely used in single-page apps and PWAs for backend infrastructure and usage tracking.
Microsoft
Microsoft runs Clarity (session recording and heatmaps), the Microsoft Advertising UET tag (conversion tracking), and Bing's remarketing pixel. Clarity injects a recording script that captures mouse movements, clicks, and rage clicks. The UET tag fires conversion events to tie ad clicks to on-site actions across Microsoft's ad network.
Microsoft Dynamics 365
Microsoft Dynamics 365 is a suite of CRM and ERP applications that integrates with websites through tracking scripts and embedded forms. Web tracking code captures visitor behavior, page views, and form submissions to build customer profiles and score leads. Sets cookies to identify returning visitors and attribute marketing touchpoints across sessions.
LinkedIn Insight Tag
LinkedIn Insight Tag is a JavaScript tracking pixel for LinkedIn's advertising and analytics platform. The tag fires on every page view to collect URL, referrer, IP address, and device data for conversion tracking, website demographics reporting, and retargeting audience building. Sets cookies to identify LinkedIn members across advertiser websites.
