
Bluesky & Mastodon Scraper API

Social Media · Developed by BarriereFix

Bluesky & Mastodon Scraper API - Decentralized Social Media Data Aggregator

Extract and monitor posts from Bluesky (AT Protocol) and Mastodon (Fediverse) with a unified, normalized JSON API. The most comprehensive social media scraper for decentralized networks - perfect for social listening, brand monitoring, market research, sentiment analysis, and AI training data collection.

🔍 Search by keywords • 👥 Track specific users • 📊 Unified data format • 🪝 Real-time webhooks • 💰 Pay-per-post pricing

🚀 Features

  • Multi-Platform Support: Scrape Bluesky and Mastodon simultaneously
  • Keyword Search: Find posts mentioning specific terms or phrases
  • Handle Tracking: Monitor specific users across platforms
  • Date Range Filtering: Historical and real-time post collection
  • Unified Schema: Normalized output format across all platforms
  • Intelligent Deduplication: Automatic duplicate detection and removal
  • Real-Time Webhooks: Send posts to your endpoints as they're discovered
  • Language Filtering: Filter posts by language (BCP-47 codes)
  • Pay-Per-Event Pricing: Only pay for posts collected, not compute time
  • No Authentication Required: Works with public data (optional auth for higher limits)

📊 Supported Platforms

Bluesky (AT Protocol)

  • Full keyword search via searchActors workaround
  • User feed tracking
  • Quote posts, replies, reposts, likes
  • Media attachments (images, videos, GIFs)
  • Rich metadata (DIDs, handles, timestamps)

Mastodon (Fediverse)

  • Multi-instance support (mastodon.social, mas.to, fosstodon.org, etc.)
  • Full keyword search across instances
  • User timeline tracking
  • Boosts, replies, favorites
  • Media attachments with alt text
  • Instance-specific data

💡 Use Cases

  • Social Listening: Track brand mentions and industry keywords
  • Market Research: Analyze trends and conversations in your niche
  • Sentiment Analysis: Collect data for AI/ML sentiment models
  • Brand Monitoring: Monitor your company and competitors
  • Academic Research: Study social media behavior and network effects
  • Content Discovery: Find engaging content for curation
  • Influencer Tracking: Monitor key voices in your industry

🎯 Quick Start

Example 1: Search for AI-related posts

```json
{
  "platforms": ["bluesky", "mastodon"],
  "query": "artificial intelligence",
  "maxItems": 100,
  "languages": ["en"]
}
```

Example 2: Track specific users

```json
{
  "platforms": ["bluesky", "mastodon"],
  "handles": ["jay.bsky.social", "@gargron@mastodon.social"],
  "maxItems": 500
}
```

Example 3: Historical search with date range

```json
{
  "platforms": ["bluesky"],
  "query": "climate change",
  "since": "2025-09-01T00:00:00Z",
  "until": "2025-10-01T00:00:00Z",
  "maxItems": 1000
}
```

Example 4: Real-time monitoring with webhooks

```json
{
  "platforms": ["bluesky", "mastodon"],
  "query": "crypto",
  "emitWebhooks": true,
  "webhooks": [
    {
      "url": "https://your-api.com/webhook",
      "headers": {"Authorization": "Bearer YOUR_TOKEN"},
      "mode": "per_item",
      "platforms": ["bluesky"]
    }
  ]
}
```
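The inputs above can also be submitted from code. The following is a minimal sketch using the `apify-client` Python package. `ACTOR_ID` is a placeholder, because the actor's store ID is not stated in this README, and `build_search_input` and `run_search` are hypothetical helper names introduced for illustration.

```python
from typing import Any, Dict, Iterator, List, Optional

# Placeholder: the actor's real store ID is not given in this README.
ACTOR_ID = "<username>/<actor-name>"


def build_search_input(
    query: str,
    platforms: Optional[List[str]] = None,
    max_items: int = 100,
    languages: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build an input object shaped like Example 1 above."""
    run_input: Dict[str, Any] = {
        "platforms": platforms or ["bluesky", "mastodon"],
        "query": query,
        "maxItems": max_items,
    }
    if languages:
        run_input["languages"] = languages
    return run_input


def run_search(token: str, run_input: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Start a run, wait for it to finish, and yield the collected posts."""
    # Imported here so build_search_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

A call such as `run_search(token, build_search_input("artificial intelligence", languages=["en"]))` would then iterate over the normalized posts once the run completes.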

📥 Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| `platforms` | Array | | Platforms to scrape: `["bluesky", "mastodon"]` |
| `query` | String | | Keywords to search for |
| `handles` | Array | | Specific user handles to track |
| `since` | String | | Start date (ISO 8601) |
| `until` | String | | End date (ISO 8601) |
| `maxItems` | Integer | | Max posts to collect (default: 1000) |
| `languages` | Array | | Language codes (e.g., `["en", "de"]`) |
| `includeReplies` | Boolean | | Include reply posts (default: false) |
| `emitWebhooks` | Boolean | | Enable webhook delivery |
| `webhooks` | Array | | Webhook endpoint configurations |
| `blueskyCredentials` | Object | | Optional auth for higher rate limits |
| `mastodonInstances` | Array | | Specific Mastodon instances to search |
| `maxConcurrency` | Integer | | Concurrent requests (default: 5) |
| `dryRun` | Boolean | | Test mode without storing data |

Note: You must provide either `query` OR `handles` (or both).
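The rule in the note can be checked on the client before a run is started. The sketch below is an illustration only, not the actor's own validation: `validate_input` is a hypothetical helper, and the check that `maxItems` is a positive integer is an assumption.

```python
from typing import Any, Dict, List

SUPPORTED_PLATFORMS = {"bluesky", "mastodon"}


def validate_input(run_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems: List[str] = []

    platforms = run_input.get("platforms") or []
    unknown = sorted(set(platforms) - SUPPORTED_PLATFORMS)
    if unknown:
        problems.append(f"unsupported platforms: {', '.join(unknown)}")

    # The note above: at least one of `query` or `handles` must be present.
    if not run_input.get("query") and not run_input.get("handles"):
        problems.append("provide `query`, `handles`, or both")

    max_items = run_input.get("maxItems", 1000)  # documented default
    if not isinstance(max_items, int) or max_items < 1:
        # Assumption: the actor expects a positive integer here.
        problems.append("`maxItems` must be a positive integer")

    return problems
```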

📤 Output Schema

Each post is normalized to a unified format:

```json
{
  "platform": "bluesky",
  "postId": "at://did:plc:xyz/app.bsky.feed.post/3kff...",
  "url": "https://bsky.app/profile/jay.bsky.social/post/3kff...",
  "text": "Building the future of social media...",
  "language": "en",
  "author": {
    "handle": "jay.bsky.social",
    "did": "did:plc:xyz",
    "displayName": "Jay Graber",
    "profileUrl": "https://bsky.app/profile/jay.bsky.social"
  },
  "createdAt": "2025-10-08T10:30:00Z",
  "metrics": {
    "replies": 42,
    "reposts": 128,
    "likes": 567,
    "quotes": 23
  },
  "entities": {
    "hashtags": ["decentralization", "atproto"],
    "mentions": ["@handle1.bsky.social"]
  },
  "media": [
    {
      "type": "image",
      "url": "https://cdn.bsky.app/...",
      "alt": "Screenshot of the app"
    }
  ],
  "source": {
    "instance": null
  },
  "references": {
    "replyTo": null,
    "quotedPost": "at://did:plc:..."
  },
  "ingest_meta": {
    "first_seen_at": "2025-10-08T11:00:00Z",
    "adapter_version": "1.0.0"
  }
}
```
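Because every post follows the same shape, downstream code can read it without branching on the platform. A minimal sketch, assuming the field names shown in the example above; `summarize_post` is a hypothetical helper, and treating missing counters as zero is an assumption.

```python
from typing import Any, Dict


def summarize_post(post: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one normalized post into a compact row for reporting."""
    metrics = post.get("metrics") or {}
    author = post.get("author") or {}
    references = post.get("references") or {}
    return {
        "platform": post.get("platform"),
        "handle": author.get("handle"),
        "createdAt": post.get("createdAt"),
        # Missing counters are treated as zero (an assumption, not documented behavior).
        "engagement": sum(
            int(metrics.get(name) or 0)
            for name in ("replies", "reposts", "likes", "quotes")
        ),
        "hasMedia": bool(post.get("media")),
        "isReply": references.get("replyTo") is not None,
    }
```

For the sample post above, the `engagement` value is 42 + 128 + 567 + 23 = 760.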

🔐 Authentication

Bluesky (Optional)

Works without authentication for public data. For higher rate limits:

```json
{
  "blueskyCredentials": {
    "identifier": "your-handle.bsky.social",
    "password": "your-app-password"
  }
}
```

Get app password: Settings → App Passwords → Add App Password

Mastodon

No authentication required for public posts.

🌐 Mastodon Instance Support

Auto-Detection

The actor automatically detects Mastodon instances from handles:

```json
{
  "handles": ["@user@mastodon.social", "@dev@fosstodon.org"]
}
```

Manual Configuration

Specify instances explicitly:

```json
{
  "mastodonInstances": ["mastodon.social", "mas.to", "fosstodon.org"]
}
```

🪝 Webhooks

Send posts to your endpoints in real-time:

```json
{
  "emitWebhooks": true,
  "webhooks": [
    {
      "url": "https://api.example.com/posts",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN",
        "Content-Type": "application/json"
      },
      "secret": "shared-secret-key",
      "mode": "per_item",
      "platforms": ["bluesky", "mastodon"]
    }
  ]
}
```

Webhook Modes:

  • `per_item`: Send each post individually
  • `batch`: Send posts in batches (coming soon)
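On the receiving side, an endpoint has to authenticate each delivery and parse the post. The sketch below rests on two assumptions that this README does not confirm: that a `per_item` delivery carries a single normalized post as its JSON body, and that the configured `Authorization` header is forwarded unchanged. How the `secret` field is applied is not documented here, so no signature check is shown. `handle_webhook` is a hypothetical helper.

```python
import hmac
import json
from typing import Any, Dict, Mapping, Optional, Tuple


def handle_webhook(
    headers: Mapping[str, str],
    body: bytes,
    expected_token: str,
) -> Tuple[int, Optional[Dict[str, Any]]]:
    """Return (HTTP status, parsed post) for one incoming delivery."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get("authorization", "")
    expected = f"Bearer {expected_token}"

    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(supplied.encode(), expected.encode()):
        return 401, None

    try:
        post = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, None

    # Assumption: the body is one post object with a `postId` field.
    if not isinstance(post, dict) or "postId" not in post:
        return 400, None
    return 200, post
```

The function is framework-agnostic: a Flask, FastAPI, or `http.server` handler can pass in its request headers and raw body and return the resulting status code.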

💰 Pricing

Pay-Per-Event Model: Only pay for posts you collect

  • $0.002 per post ($2 per 1,000 posts)
  • No compute time charges
  • No setup fees
  • Cancel anytime

Examples:

  • 100 posts = $0.20
  • 1,000 posts = $2.00
  • 10,000 posts = $20.00
  • 100,000 posts = $200.00

Simple, transparent pricing - you only pay for what you use.
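The arithmetic behind the examples can be reproduced with a short helper, for instance to check a planned `maxItems` against a budget. This is a sketch based only on the per-post rate listed above; `estimate_cost` is a hypothetical name, and rounding to whole cents is an assumption about how charges are presented.

```python
from decimal import Decimal

PRICE_PER_POST = Decimal("0.002")  # USD, from the pricing list above


def estimate_cost(posts: int) -> Decimal:
    """Return the estimated charge in USD, rounded to cents."""
    if posts < 0:
        raise ValueError("posts must be non-negative")
    # Decimal avoids the binary floating-point error of 0.002 * posts.
    return (PRICE_PER_POST * posts).quantize(Decimal("0.01"))
```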

📅 Scheduling

Run every hour

```
0 * * * *
```

Run daily at midnight

```
0 0 * * *
```

Run every 15 minutes

```
*/15 * * * *
```

🔄 Deduplication

The actor automatically:

  • Tracks seen posts with state management
  • Skips duplicates across runs
  • Cleans up old state entries (30+ days)
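The three behaviors listed above can be pictured with a small in-memory model. This is an illustrative sketch, not the actor's implementation: the class name, the `(platform, postId)` key, and the use of first-seen timestamps for cleanup are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

RETENTION = timedelta(days=30)  # matches the 30-day cleanup described above


class SeenPosts:
    """Remember which posts were already emitted, keyed by (platform, postId)."""

    def __init__(self, state: Optional[Dict[Tuple[str, str], datetime]] = None) -> None:
        self._first_seen: Dict[Tuple[str, str], datetime] = dict(state or {})

    def is_new(self, platform: str, post_id: str, now: Optional[datetime] = None) -> bool:
        """Record the post and report whether this is its first sighting."""
        key = (platform, post_id)
        if key in self._first_seen:
            return False
        self._first_seen[key] = now or datetime.now(timezone.utc)
        return True

    def prune(self, now: Optional[datetime] = None) -> int:
        """Drop entries older than RETENTION and return how many were removed."""
        cutoff = (now or datetime.now(timezone.utc)) - RETENTION
        stale = [key for key, seen in self._first_seen.items() if seen < cutoff]
        for key in stale:
            del self._first_seen[key]
        return len(stale)
```

One consequence of a retention window like this is that a post older than 30 days can be collected again if it reappears in a later run.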

⚡ Performance

  • Speed: ~100-200 posts/minute per platform
  • Rate Limits: Respects platform rate limits automatically
  • Concurrency: Configurable (1-20 concurrent requests)
  • Memory: ~256MB typical, ~512MB for large runs

🛠️ Advanced Configuration

Language Filtering

```json
{
  "languages": ["en", "de", "ja", "es"]
}
```

Date Range

```json
{
  "since": "2025-09-01T00:00:00Z",
  "until": "2025-10-01T00:00:00Z"
}
```

Include Replies

```json
{
  "includeReplies": true
}
```

Dry Run (Testing)

```json
{
  "dryRun": true
}
```

📊 Dataset Views

The actor provides three pre-configured views in Apify Console:

  1. Overview: All posts with key metrics
  2. By Platform: Posts grouped by source
  3. Top Engagement: Sorted by likes/reposts
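The same groupings can be rebuilt on exported data outside the Console. A rough sketch, assuming the `platform` and `metrics` fields from the output schema; the exact sort key of the built-in view is not documented here, so ordering by likes and then reposts is an assumption, and both function names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def top_engagement(posts: Iterable[Dict[str, Any]], limit: int = 10) -> List[Dict[str, Any]]:
    """Return up to `limit` posts ordered by likes, then reposts, highest first."""

    def sort_key(post: Dict[str, Any]) -> Tuple[int, int]:
        metrics = post.get("metrics") or {}
        return (metrics.get("likes") or 0, metrics.get("reposts") or 0)

    return sorted(posts, key=sort_key, reverse=True)[:limit]


def by_platform(posts: Iterable[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group posts by their `platform` field."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for post in posts:
        groups[post.get("platform", "unknown")].append(post)
    return dict(groups)
```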

🔍 Search Tips

Keyword Search

  • Prefer specific terms: "machine learning" rather than "AI"
  • Combine keywords: "climate change policy"
  • Use quotes for exact phrases (Bluesky only)

Handle Formats

  • Bluesky: `jay.bsky.social` or `handle.domain.com`
  • Mastodon: `@username@instance.social` or `instance.social/@username`
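To illustrate how these formats can be told apart, here is a minimal parser sketch. It is not the actor's own detection logic: the regular expressions are simplified assumptions about valid characters, the leading `@` on Mastodon handles is treated as optional, and `parse_handle` is a hypothetical name.

```python
import re
from typing import NamedTuple, Optional


class ParsedHandle(NamedTuple):
    platform: str
    username: str
    instance: Optional[str]  # Mastodon only; None for Bluesky


# @username@instance.social (leading "@" treated as optional)
_MASTODON_AT = re.compile(r"^@?([A-Za-z0-9_]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})$")
# instance.social/@username
_MASTODON_URL = re.compile(r"^([A-Za-z0-9.-]+\.[A-Za-z]{2,})/@([A-Za-z0-9_]+)$")
# jay.bsky.social or handle.domain.com: a plain dotted domain name
_BLUESKY = re.compile(r"^([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)$")


def parse_handle(raw: str) -> ParsedHandle:
    """Classify a handle string using the formats listed above."""
    handle = raw.strip()
    match = _MASTODON_AT.match(handle)
    if match:
        return ParsedHandle("mastodon", match.group(1), match.group(2).lower())
    match = _MASTODON_URL.match(handle)
    if match:
        return ParsedHandle("mastodon", match.group(2), match.group(1).lower())
    match = _BLUESKY.match(handle)
    if match:
        return ParsedHandle("bluesky", match.group(1).lower(), None)
    raise ValueError(f"unrecognized handle format: {raw!r}")
```

Collecting the `instance` values of the parsed Mastodon handles gives the set of instances a run would need to contact, which is one plausible reading of the auto-detection described earlier.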

Date Ranges

  • Use ISO 8601 format: `2025-10-08T10:30:00Z`
  • Timezone: Always UTC (Z suffix)
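Timestamps in this format can be parsed and compared with the standard library. The sketch below is a client-side illustration; whether the actor treats `until` as inclusive or exclusive is not stated here, so the half-open interval (`since` inclusive, `until` exclusive) is an assumption, as are the helper names.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2025-10-08T10:30:00Z into UTC."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so rewrite it as an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


def in_range(created_at: str, since: Optional[str] = None, until: Optional[str] = None) -> bool:
    """Check since <= created_at < until; either bound may be omitted."""
    moment = parse_utc(created_at)
    if since is not None and moment < parse_utc(since):
        return False
    if until is not None and moment >= parse_utc(until):
        return False
    return True
```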

⚠️ Limitations

  • Bluesky: Keyword search uses searchActors workaround (may be slower than native search)
  • Mastodon: Search quality depends on instance search capabilities
  • Rate Limits: Public APIs have rate limits (authentication increases limits)
  • Historical Data: Availability depends on platform retention policies

🆘 Support

  • Email: kontakt@barrierefix.de
  • Issues: Report bugs or request features
  • Documentation: Full API docs in source code

📜 License

MIT License - Free to use commercially and privately

🏷️ Tags

`bluesky` `mastodon` `at-protocol` `fediverse` `social-media` `scraper` `aggregator` `decentralized` `web3` `social-listening` `brand-monitoring` `sentiment-analysis` `market-research` `data-collection` `apify`


🔗 Explore More of Our Actors

💬 Social Media & Community

| Actor | Description |
|---|---|
| Reddit Scraper Pro | Monitor subreddits and track keywords with sentiment analysis |
| Discord Scraper Pro | Extract Discord messages and chat history for community insights |
| YouTube Comments Harvester | Comprehensive YouTube comments scraper with channel-wide enumeration |
| YouTube Contact Scraper | Extract YouTube channel contact information for outreach |
| YouTube Shorts Scraper | Scrape YouTube Shorts for viral content research |

🏢 Business Intelligence

| Actor | Description |
|---|---|
| Indeed Salary Analyzer | Get salary data for compensation benchmarking and HR analytics |
| Crunchbase Scraper | Extract company data and funding information for business intelligence |
| Northdata Scraper | Extract German company data from Northdata for business research |
| Shopify Store Intelligence | Analyze Shopify stores for competitive intelligence and market research |
| Apify Store Radar | Monitor Apify Store actors for market intelligence |


Built by Barrierefix | Powered by Apify