Apify — Skillopedia

Customization

Before executing, check for user customizations at: If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.

🚨 MANDATORY: Voice Notification (REQUIRED BEFORE ANY ACTION)

You MUST send this notification BEFORE doing anything else when this skill is invoked.

1. Send voice notification:
2. Output text notification:

This is not optional. Execute this curl command immediately upon skill invocation.

Apify - Social Media & Web Scraping

Direct T…

, '')) \u003c 100\n)\n```\n\n## 🎨 Advanced Patterns\n\n### Pattern 1: Multi-Platform Social Listening\n\n```typescript\nimport {\n scrapeInstagramHashtag,\n scrapeTikTokHashtag,\n searchYouTube\n} from 'actors'\n\n// Run all platforms in parallel\nconst [instagramPosts, tiktokVideos, youtubeVideos] = await Promise.all([\n scrapeInstagramHashtag({ hashtag: 'ai', maxResults: 100 }),\n scrapeTikTokHashtag({ hashtag: 'ai', maxResults: 100 }),\n searchYouTube({ query: '#ai', maxResults: 100 })\n])\n\n// Combine and filter - only viral content across all platforms\nconst allViral = [\n ...instagramPosts.filter(p => p.likesCount > 10000),\n ...tiktokVideos.filter(v => v.playCount > 100000),\n ...youtubeVideos.filter(v => v.viewsCount > 50000)\n]\n\nconsole.log(`Found ${allViral.length} viral posts across 3 platforms`)\n```\n\n### Pattern 2: Lead Enrichment Pipeline\n\n```typescript\nimport { searchGoogleMaps, scrapeLinkedInProfile } from 'actors'\n\n// 1. Find businesses on Google Maps\nconst restaurants = await searchGoogleMaps({\n query: 'restaurants in SF',\n maxResults: 100,\n scrapeContactInfo: true\n})\n\n// 2. Filter for qualified leads\nconst qualified = restaurants.filter(r =>\n r.rating >= 4.5 &&\n r.email &&\n r.reviewsCount >= 50\n)\n\n// 3. Enrich with LinkedIn data (if available)\nconst enriched = await Promise.all(\n qualified.map(async (restaurant) => {\n // Try to find LinkedIn company page\n // ... 
additional enrichment logic\n return restaurant\n })\n)\n```\n\n### Pattern 3: Competitive Analysis Dashboard\n\n```typescript\nimport {\n scrapeInstagramProfile,\n scrapeYouTubeChannel,\n scrapeTikTokProfile\n} from 'actors'\n\n// NOTE: average() and calculateEngagement() are not imported from 'actors' above;\n// they are assumed to be helper functions defined elsewhere in your project.\nasync function analyzeCompetitor(username: string) {\n // Gather data from all platforms in parallel\n const [instagram, youtube, tiktok] = await Promise.all([\n scrapeInstagramProfile({ username, maxPosts: 30 }),\n scrapeYouTubeChannel({ channelUrl: `https://youtube.com/@${username}`, maxVideos: 30 }),\n scrapeTikTokProfile({ username, maxVideos: 30 })\n ])\n\n // Calculate engagement metrics in code\n return {\n username,\n instagram: {\n followers: instagram.followersCount,\n avgLikes: average(instagram.latestPosts?.map(p => p.likesCount) || []),\n engagementRate: calculateEngagement(instagram)\n },\n youtube: {\n subscribers: youtube.subscribersCount,\n avgViews: average(youtube.videos?.map(v => v.viewsCount) || [])\n },\n tiktok: {\n followers: tiktok.followersCount,\n avgPlays: average(tiktok.videos?.map(v => v.playCount) || [])\n }\n }\n}\n```\n\n## 💰 Token Savings Calculator\n\n**Example: Instagram profile with 100 posts**\n\n**MCP Approach:**\n```\n1. search-actors → 1,000 tokens\n2. call-actor → 1,000 tokens\n3. 
get-actor-output → 50,000 tokens (100 unfiltered posts)\nTOTAL: ~52,000 tokens\n```\n\n**File-Based Approach:**\n```typescript\nconst profile = await scrapeInstagramProfile({\n username: 'user',\n maxPosts: 100\n})\n\n// Filter in code - only top 10 posts\nconst top = profile.latestPosts\n ?.sort((a, b) => b.likesCount - a.likesCount)\n .slice(0, 10)\n\n// TOTAL: ~500 tokens (only 10 filtered posts reach model)\n```\n\n**Savings: 99% reduction (52,000 → 500 tokens)**\n\n## 🔧 Actor Reference\n\n### Social Media\n\n#### Instagram\n- `scrapeInstagramProfile(input)` - Profile + posts\n- `scrapeInstagramPosts(input)` - Posts from user\n- `scrapeInstagramHashtag(input)` - Posts by hashtag\n- `scrapeInstagramComments(input)` - Comments on post\n\n#### LinkedIn\n- `scrapeLinkedInProfile(input)` - Profile + experience + email\n- `searchLinkedInJobs(input)` - Job listings\n- `scrapeLinkedInPosts(input)` - Posts from profile/company\n\n#### TikTok\n- `scrapeTikTokProfile(input)` - Profile + videos\n- `scrapeTikTokHashtag(input)` - Videos by hashtag\n- `scrapeTikTokComments(input)` - Comments on video\n\n#### YouTube\n- `scrapeYouTubeChannel(input)` - Channel + videos\n- `searchYouTube(input)` - Search videos\n- `scrapeYouTubeComments(input)` - Comments on video\n\n#### Facebook\n- `scrapeFacebookPosts(input)` - Posts from pages\n- `scrapeFacebookGroups(input)` - Group posts\n- `scrapeFacebookComments(input)` - Post comments\n\n### Business & Lead Generation\n\n#### Google Maps\n- `searchGoogleMaps(input)` - Search places (with contact extraction!)\n- `scrapeGoogleMapsPlace(input)` - Single place details\n- `scrapeGoogleMapsReviews(input)` - Place reviews\n\n### E-commerce\n\n#### Amazon\n- `scrapeAmazonProduct(input)` - Product details + reviews\n- `scrapeAmazonReviews(input)` - Product reviews only\n\n### Web Scraping\n\n#### General Web\n- `scrapeWebsite(input)` - Custom multi-page crawling\n- `scrapePage(url, pageFunction)` - Single page extraction\n\n## ⚙️ 
Configuration\n\n**Environment Variables:**\n```bash\n# Required - Get from https://console.apify.com/account/integrations\nAPIFY_TOKEN=apify_api_xxxxx...\n```\n\n**Actor Run Options:**\n```typescript\n{\n memory: 2048, // MB: 128, 256, 512, 1024, 2048, 4096, 8192\n timeout: 300, // seconds\n build: 'latest' // or specific build number\n}\n```\n\n## 🎯 When to Use This vs MCP\n\n**Use File-Based (this skill):**\n- ✅ Need to filter large datasets (>100 results)\n- ✅ Want to transform/aggregate data in code\n- ✅ Multiple sequential operations\n- ✅ Control flow (loops, conditionals)\n- ✅ Maximum token efficiency\n\n**Use MCP:**\n- ❌ Simple single operations with small results (\u003c10 items)\n- ❌ One-off exploratory queries\n- ❌ Don't want to write code\n\n## 🔗 Links\n\n- Apify Platform: https://apify.com\n- Actor Store: https://apify.com/store\n- API Docs: https://docs.apify.com/api/v2\n\n---\n\n**Remember: Filter data in code BEFORE returning to model context. This is where the 99% token savings happen!**\n\n## Gotchas\n\n- **Actor selection matters.** Each social platform has specific actors — don't use a generic scraper for Instagram when a dedicated Instagram actor exists.\n- **Rate limits vary by platform and plan.** Check actor documentation for limits before running large scrapes.\n- **Scraped data format varies by actor.** Read the actor's output schema before processing results.\n\n## Examples\n\n**Example 1: Scrape Instagram profile**\n```\nUser: \"get the recent posts from this Instagram account\"\n→ Selects Instagram Profile actor\n→ Runs with target profile URL\n→ Returns structured post data (text, engagement, dates)\n```\n\n**Example 2: LinkedIn company scrape**\n```\nUser: \"scrape this company's LinkedIn page\"\n→ Selects LinkedIn Company actor\n→ Returns company info, employee count, recent posts\n```\n\n## Execution Log\n\nAfter completing any workflow, append a single JSONL entry:\n\n```bash\necho '{\"ts\":\"'$(date -u 
+%Y-%m-%dT%H:%M:%SZ)'\",\"skill\":\"Apify\",\"workflow\":\"WORKFLOW_USED\",\"input\":\"8_WORD_SUMMARY\",\"status\":\"ok|error\",\"duration_s\":SECONDS}' >> ~/.claude/PAI/MEMORY/SKILLS/execution.jsonl\n```\n\nReplace `WORKFLOW_USED` with the workflow executed, `8_WORD_SUMMARY` with a brief input description, and `SECONDS` with approximate wall-clock time. Log `status: \"error\"` if the workflow failed.\n---","attachment_filenames":["actors/business/google-maps.ts","actors/business/index.ts","actors/ecommerce/amazon.ts","actors/ecommerce/index.ts","actors/index.ts","actors/social-media/facebook.ts","actors/social-media/index.ts","actors/social-media/instagram.ts","actors/social-media/linkedin.ts","actors/social-media/tiktok.ts","actors/social-media/twitter.ts","actors/social-media/youtube.ts","actors/web/index.ts","actors/web/web-scraper.ts","examples/comparison-test.ts","examples/instagram-scraper.ts","examples/smoke-test.ts","index.ts","INTEGRATION.md","package.json","README.md","skills/get-user-tweets.ts","tsconfig.json","types/common.ts","types/index.ts","Workflows/Update.md"],"attachments":[{"filename":"actors/business/google-maps.ts","content":"/**\n * Google Maps Scraper\n *\n * Apify Actor: compass/crawler-google-places (198,093 users, 4.76 rating)\n * Pricing: $0.001-$0.007 per event (Actor start + per place + optional add-ons)\n *\n * HIGHEST VALUE ACTOR - 198k users!\n * Extract Google Maps business data, reviews, contacts, images - perfect for lead generation.\n */\n\nimport { Apify } from '../../index'\nimport type {\n BusinessInfo,\n Location,\n ContactInfo,\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface GoogleMapsSearchInput extends PaginationOptions {\n /** Search query (e.g., \"restaurants in San Francisco\") */\n query: string\n /** Maximum 
number of places to scrape */\n maxResults?: number\n /** Include reviews for each place */\n includeReviews?: boolean\n /** Maximum reviews per place */\n maxReviewsPerPlace?: number\n /** Include images */\n includeImages?: boolean\n /** Scrape contact information from websites */\n scrapeContactInfo?: boolean\n /** Language code (en, es, fr, de, etc.) */\n language?: string\n /** Country code for search region */\n country?: string\n}\n\nexport interface GoogleMapsPlaceInput {\n /** Google Maps place URL or Place ID */\n placeUrl: string\n /** Include reviews */\n includeReviews?: boolean\n /** Maximum reviews to scrape */\n maxReviews?: number\n /** Include images */\n includeImages?: boolean\n /** Scrape contact info from website */\n scrapeContactInfo?: boolean\n}\n\nexport interface GoogleMapsPlace extends BusinessInfo {\n placeId: string\n name: string\n url: string\n category?: string\n categories?: string[]\n address?: string\n location?: Location\n rating?: number\n reviewsCount?: number\n priceLevel?: number\n phone?: string\n website?: string\n email?: string\n openingHours?: OpeningHours\n popularTimes?: PopularTimes[]\n isTemporarilyClosed?: boolean\n isPermanentlyClosed?: boolean\n totalScore?: number\n reviewsDistribution?: ReviewsDistribution\n imageUrls?: string[]\n reviews?: GoogleMapsReview[]\n contactInfo?: ContactInfo\n socialMedia?: {\n facebook?: string\n twitter?: string\n instagram?: string\n linkedin?: string\n }\n verificationStatus?: string\n}\n\nexport interface OpeningHours {\n monday?: string\n tuesday?: string\n wednesday?: string\n thursday?: string\n friday?: string\n saturday?: string\n sunday?: string\n}\n\nexport interface PopularTimes {\n day: string\n hours: Array\u003c{\n hour: number\n occupancyPercent: number\n }>\n}\n\nexport interface ReviewsDistribution {\n oneStar?: number\n twoStar?: number\n threeStar?: number\n fourStar?: number\n fiveStar?: number\n}\n\nexport interface GoogleMapsReview {\n id?: string\n text: 
string\n publishedAtDate: string\n rating: number\n likesCount?: number\n reviewerId?: string\n reviewerName?: string\n reviewerPhotoUrl?: string\n reviewerReviewsCount?: number\n responseFromOwner?: string\n responseFromOwnerDate?: string\n imageUrls?: string[]\n}\n\nexport interface GoogleMapsReviewsInput extends PaginationOptions {\n /** Google Maps place URL */\n placeUrl: string\n /** Maximum number of reviews to scrape */\n maxResults?: number\n /** Minimum rating filter (1-5) */\n minRating?: number\n /** Language code */\n language?: string\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Search Google Maps for places matching a query\n *\n * @param input - Search parameters\n * @param options - Actor run options\n * @returns Array of Google Maps places\n *\n * @example\n * ```typescript\n * // Search for coffee shops in SF\n * const places = await searchGoogleMaps({\n * query: 'coffee shops in San Francisco',\n * maxResults: 50,\n * includeReviews: true,\n * maxReviewsPerPlace: 10\n * })\n *\n * // Filter in code - only highly rated with many reviews\n * const topCoffeeShops = places\n * .filter(p => p.rating >= 4.5 && p.reviewsCount >= 100)\n * .sort((a, b) => b.rating - a.rating)\n * .slice(0, 10)\n *\n * // Extract emails for lead generation\n * const leads = topCoffeeShops\n * .filter(p => p.email)\n * .map(p => ({ name: p.name, email: p.email, phone: p.phone }))\n * ```\n */\nexport async function searchGoogleMaps(\n input: GoogleMapsSearchInput,\n options?: ActorRunOptions\n): Promise\u003cGoogleMapsPlace[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('compass/crawler-google-places', {\n searchStringsArray: [input.query],\n maxCrawledPlacesPerSearch: input.maxResults || 50,\n language: input.language || 'en',\n countryCode: input.country,\n includeReviews: input.includeReviews || 
false,\n maxReviews: input.maxReviewsPerPlace || 0,\n includeImages: input.includeImages || false,\n scrapeCompanyEmails: input.scrapeContactInfo || false,\n scrapeSocialMediaLinks: input.scrapeContactInfo || false\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Google Maps search failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map(transformPlace)\n}\n\n/**\n * Scrape detailed data for a specific Google Maps place\n *\n * @param input - Place scraping parameters\n * @param options - Actor run options\n * @returns Detailed place information\n *\n * @example\n * ```typescript\n * // Scrape a specific place with reviews\n * const place = await scrapeGoogleMapsPlace({\n * placeUrl: 'https://maps.google.com/maps?cid=12345',\n * includeReviews: true,\n * maxReviews: 100,\n * scrapeContactInfo: true\n * })\n *\n * // Filter reviews in code - only recent 5-star reviews\n * const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)\n * const recentExcellent = place.reviews?.filter(r =>\n * r.rating === 5 &&\n * new Date(r.publishedAtDate).getTime() > thirtyDaysAgo\n * )\n * ```\n */\nexport async function scrapeGoogleMapsPlace(\n input: GoogleMapsPlaceInput,\n options?: ActorRunOptions\n): Promise\u003cGoogleMapsPlace> {\n const apify = new Apify()\n\n const run = await apify.callActor('compass/crawler-google-places', {\n startUrls: [input.placeUrl],\n includeReviews: input.includeReviews || false,\n maxReviews: input.maxReviews || 0,\n includeImages: input.includeImages || false,\n scrapeCompanyEmails: input.scrapeContactInfo || false,\n scrapeSocialMediaLinks: input.scrapeContactInfo || false\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n 
if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Google Maps place scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit: 1 })\n\n if (items.length === 0) {\n throw new Error(`Place not found: ${input.placeUrl}`)\n }\n\n return transformPlace(items[0])\n}\n\n/**\n * Scrape reviews for a Google Maps place\n *\n * @param input - Review scraping parameters\n * @param options - Actor run options\n * @returns Array of reviews\n *\n * @example\n * ```typescript\n * // Get 500 reviews for sentiment analysis\n * const reviews = await scrapeGoogleMapsReviews({\n * placeUrl: 'https://maps.google.com/maps?cid=12345',\n * maxResults: 500,\n * language: 'en'\n * })\n *\n * // Filter in code - only detailed reviews\n * const detailedReviews = reviews.filter(r =>\n * r.text.length > 100 &&\n * r.imageUrls && r.imageUrls.length > 0\n * )\n *\n * // Analyze sentiment by rating\n * const negative = reviews.filter(r => r.rating \u003c= 2)\n * const positive = reviews.filter(r => r.rating >= 4)\n * ```\n */\nexport async function scrapeGoogleMapsReviews(\n input: GoogleMapsReviewsInput,\n options?: ActorRunOptions\n): Promise\u003cGoogleMapsReview[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('compass/Google-Maps-Reviews-Scraper', {\n startUrls: [input.placeUrl],\n maxReviews: input.maxResults || 100,\n reviewsSort: 'newest',\n language: input.language || 'en'\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Google Maps reviews scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n // Filter by rating if specified\n let reviews = items.map(transformReview)\n if (input.minRating) {\n 
reviews = reviews.filter(r => r.rating >= input.minRating!)\n }\n\n return reviews\n}\n\n/* ============================================================================\n * HELPERS\n * ========================================================================= */\n\n/**\n * Transform raw Google Maps place data to our standard format\n */\nfunction transformPlace(place: any): GoogleMapsPlace {\n return {\n placeId: place.placeId,\n name: place.title || place.name,\n url: place.url,\n category: place.categoryName,\n categories: place.categories || [place.categoryName],\n address: place.address,\n location: {\n latitude: place.location?.lat,\n longitude: place.location?.lng,\n address: place.address,\n city: place.city,\n state: place.state,\n country: place.countryCode,\n postalCode: place.postalCode\n },\n rating: place.totalScore,\n totalScore: place.totalScore,\n reviewsCount: place.reviewsCount,\n priceLevel: place.priceLevel,\n phone: place.phone,\n website: place.website,\n email: place.email || place.companyEmail,\n openingHours: place.openingHours,\n popularTimes: place.popularTimesHistogram,\n isTemporarilyClosed: place.temporarilyClosed,\n isPermanentlyClosed: place.permanentlyClosed,\n reviewsDistribution: place.reviewsDistribution,\n imageUrls: place.imageUrls,\n reviews: place.reviews?.map(transformReview),\n contact: {\n email: place.email || place.companyEmail,\n phone: place.phone,\n website: place.website,\n socialMedia: {\n facebook: place.facebookUrl,\n twitter: place.twitterUrl,\n instagram: place.instagramUrl,\n linkedin: place.linkedinUrl\n }\n },\n contactInfo: {\n email: place.email || place.companyEmail,\n phone: place.phone,\n website: place.website\n },\n socialMedia: {\n facebook: place.facebookUrl,\n twitter: place.twitterUrl,\n instagram: place.instagramUrl,\n linkedin: place.linkedinUrl\n },\n verificationStatus: place.claimThisBusiness\n }\n}\n\n/**\n * Transform raw Google Maps review data to our standard format\n */\nfunction 
transformReview(review: any): GoogleMapsReview {\n return {\n id: review.reviewId,\n text: review.text || review.reviewText,\n publishedAtDate: review.publishedAtDate || review.publishAt,\n rating: review.stars || review.rating,\n likesCount: review.likesCount,\n reviewerId: review.reviewerId,\n reviewerName: review.name || review.reviewerName,\n reviewerPhotoUrl: review.profilePhotoUrl || review.reviewerPhotoUrl,\n reviewerReviewsCount: review.reviewerNumberOfReviews,\n responseFromOwner: review.responseFromOwnerText,\n responseFromOwnerDate: review.responseFromOwnerDate,\n imageUrls: review.reviewImageUrls\n }\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":11520,"content_sha256":"1bfe4eb2820980c67e4ef57666642392815eacdb0029b8158261df6dc4745be2"},{"filename":"actors/business/index.ts","content":"/**\n * Business & Lead Generation Actors\n *\n * - Google Maps (198k users - HIGHEST VALUE!)\n */\n\nexport * from './google-maps'\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":126,"content_sha256":"3c200940d341b7afa7f427f025e2949c95dd436c19bb34e4d39a54e2aac8f608"},{"filename":"actors/ecommerce/amazon.ts","content":"/**\n * Amazon Scraper\n *\n * Top Actors:\n * - junglee/free-amazon-product-scraper (8,898 users, 4.97 rating)\n * - axesso_data/amazon-reviews-scraper (1,647 users, 4.62 rating, $0.75/1k reviews)\n *\n * Extract Amazon product data, reviews, pricing without API.\n */\n\nimport { Apify } from '../../index'\nimport type {\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface AmazonProductInput {\n /** Amazon product URL or ASIN */\n productUrl: string\n /** Include reviews */\n includeReviews?: boolean\n /** Maximum reviews to scrape */\n maxReviews?: number\n}\n\nexport interface 
AmazonProduct {\n asin: string\n title: string\n url: string\n price?: number\n currency?: string\n priceString?: string\n originalPrice?: number\n discount?: string\n rating?: number\n reviewsCount?: number\n stars?: number\n description?: string\n features?: string[]\n images?: string[]\n variants?: ProductVariant[]\n availability?: string\n inStock?: boolean\n seller?: string\n brand?: string\n category?: string\n reviews?: AmazonReview[]\n}\n\nexport interface ProductVariant {\n asin: string\n title: string\n price?: number\n imageUrl?: string\n}\n\nexport interface AmazonReviewsInput extends PaginationOptions {\n /** Amazon product URL or ASIN */\n productUrl: string\n /** Maximum reviews to scrape */\n maxResults?: number\n /** Star rating filter (1-5) */\n starRating?: number\n /** Verified purchases only */\n verifiedOnly?: boolean\n}\n\nexport interface AmazonReview {\n id: string\n title: string\n text: string\n rating: number\n date: string\n verifiedPurchase?: boolean\n helpful?: number\n reviewerName?: string\n reviewerUrl?: string\n images?: string[]\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape Amazon product data\n *\n * @param input - Product scraping options\n * @param options - Actor run options\n * @returns Amazon product details\n *\n * @example\n * ```typescript\n * const product = await scrapeAmazonProduct({\n * productUrl: 'https://www.amazon.com/dp/B08L5VT894',\n * includeReviews: true,\n * maxReviews: 50\n * })\n *\n * console.log(`${product.title} - ${product.price}`)\n * console.log(`Rating: ${product.rating}/5 (${product.reviewsCount} reviews)`)\n *\n * // Filter reviews in code - only 5-star verified purchases\n * const topReviews = product.reviews?.filter(r =>\n * r.rating === 5 && r.verifiedPurchase\n * )\n * ```\n */\nexport async function scrapeAmazonProduct(\n input: 
AmazonProductInput,\n options?: ActorRunOptions\n): Promise\u003cAmazonProduct> {\n const apify = new Apify()\n\n const run = await apify.callActor('junglee/free-amazon-product-scraper', {\n startUrls: [input.productUrl],\n maxReviews: input.maxReviews || 0,\n includeReviews: input.includeReviews || false\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Amazon product scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit: 1 })\n\n if (items.length === 0) {\n throw new Error(`Product not found: ${input.productUrl}`)\n }\n\n const product = items[0]\n\n return {\n asin: product.asin,\n title: product.title,\n url: product.url || input.productUrl,\n price: product.price,\n currency: product.currency,\n priceString: product.priceString,\n originalPrice: product.originalPrice,\n discount: product.discount,\n rating: product.stars || product.rating,\n stars: product.stars,\n reviewsCount: product.reviews || product.reviewsCount,\n description: product.description,\n features: product.features || product.featureBullets,\n images: product.images,\n variants: product.variants,\n availability: product.availability,\n inStock: product.inStock,\n seller: product.seller,\n brand: product.brand,\n category: product.category,\n reviews: product.topReviews?.map((r: any) => ({\n id: r.id,\n title: r.title,\n text: r.text || r.body,\n rating: r.stars || r.rating,\n date: r.date,\n verifiedPurchase: r.verified,\n helpful: r.helpful,\n reviewerName: r.reviewer,\n reviewerUrl: r.reviewerUrl,\n images: r.images\n }))\n }\n}\n\n/**\n * Scrape Amazon product reviews\n *\n * @param input - Review scraping options\n * @param options - Actor run options\n * @returns Array of Amazon reviews\n *\n * @example\n * ```typescript\n * const reviews = await scrapeAmazonReviews({\n * productUrl: 
'https://www.amazon.com/dp/B08L5VT894',\n * maxResults: 500,\n * verifiedOnly: true\n * })\n *\n * // Filter in code - only detailed reviews\n * const detailed = reviews.filter(r =>\n * r.text.length > 200 &&\n * r.images && r.images.length > 0\n * )\n *\n * // Analyze sentiment by star rating\n * const positive = reviews.filter(r => r.rating >= 4)\n * const negative = reviews.filter(r => r.rating \u003c= 2)\n * console.log(`Sentiment: ${positive.length}+ / ${negative.length}-`)\n * ```\n */\nexport async function scrapeAmazonReviews(\n input: AmazonReviewsInput,\n options?: ActorRunOptions\n): Promise\u003cAmazonReview[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('axesso_data/amazon-reviews-scraper', {\n urls: [input.productUrl],\n maxReviews: input.maxResults || 100,\n starRating: input.starRating,\n verifiedPurchaseOnly: input.verifiedOnly\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Amazon reviews scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((review: any) => ({\n id: review.id || review.reviewId,\n title: review.title,\n text: review.text || review.body,\n rating: review.stars || review.rating,\n date: review.date,\n verifiedPurchase: review.verifiedPurchase || review.verified,\n helpful: review.helpful || review.helpfulCount,\n reviewerName: review.reviewerName || review.author,\n reviewerUrl: review.reviewerUrl,\n images: review.images || review.reviewImages\n }))\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":6526,"content_sha256":"8a474cf08694e65a83841b535d6a651e4b148d198d48520120360605d4be5c06"},{"filename":"actors/ecommerce/index.ts","content":"/**\n * E-commerce Actors\n *\n * - Amazon 
(products, reviews, pricing)\n */\n\nexport * from './amazon'\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":99,"content_sha256":"62d60f833cb2a9db8d4970e3eb7bc1d015c537e9967c51b78c5b9d79f8ede8cc"},{"filename":"actors/index.ts","content":"/**\n * Apify Actors - File-Based API Wrappers\n *\n * Direct API access to the most popular Apify actors without MCP overhead.\n * Filter data in code BEFORE returning to model context for massive token savings.\n *\n * Categories:\n * - Social Media: Instagram, LinkedIn, TikTok, YouTube, Facebook\n * - Business: Google Maps (lead generation)\n * - E-commerce: Amazon\n * - Web: General-purpose web scraper\n *\n * Token Efficiency Example:\n * - MCP approach: ~50,000 tokens (full unfiltered dataset)\n * - Code-first approach: ~500 tokens (filtered top 10 results)\n * - Savings: 99% token reduction!\n *\n * @example\n * ```typescript\n * import { scrapeInstagramProfile, searchGoogleMaps } from ''\n *\n * // Instagram profile with filtering\n * const profile = await scrapeInstagramProfile({\n * username: 'exampleuser',\n * maxPosts: 50\n * })\n *\n * // Filter in code - only viral posts\n * const viral = profile.latestPosts?.filter(p => p.likesCount > 10000)\n *\n * // Google Maps lead generation\n * const places = await searchGoogleMaps({\n * query: 'coffee shops in San Francisco',\n * maxResults: 100,\n * scrapeContactInfo: true\n * })\n *\n * // Filter in code - only highly rated with email\n * const leads = places\n * .filter(p => p.rating >= 4.5 && p.email)\n * .map(p => ({ name: p.name, email: p.email, phone: p.phone }))\n * ```\n */\n\n// Social Media\nexport * from './social-media'\n\n// Business & Lead Generation\nexport * from './business'\n\n// E-commerce\nexport * from './ecommerce'\n\n// Web Scraping\nexport * from './web'\n","content_type":"text/typescript; 
charset=utf-8","language":"typescript","size":1520,"content_sha256":"d24d3ee5f1fe955848815fa0f3a51a77fbb9f930acb29b65a280dd72bcf580d0"},{"filename":"actors/social-media/facebook.ts","content":"/**\n * Facebook Scraper\n *\n * Top Actors:\n * - apify/facebook-posts-scraper (35,226 users, 4.56 rating)\n * - apify/facebook-groups-scraper (16,182 users, 4.19 rating)\n * - apify/facebook-comments-scraper (17,173 users, 4.46 rating)\n *\n * Extract Facebook posts, groups, comments, pages without login.\n */\n\nimport { Apify } from '../../index'\nimport type {\n Post,\n UserProfile,\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface FacebookPostsInput extends PaginationOptions {\n /** Facebook page or profile URLs */\n pageUrls: string[]\n /** Maximum number of posts per page */\n maxPostsPerPage?: number\n /** Date range filter */\n fromDate?: string\n toDate?: string\n}\n\nexport interface FacebookPost extends Post {\n id: string\n url: string\n text?: string\n postDate: string\n pageUrl?: string\n pageName?: string\n likesCount?: number\n commentsCount?: number\n sharesCount?: number\n imageUrls?: string[]\n videoUrl?: string\n type?: 'post' | 'video' | 'image' | 'link'\n}\n\nexport interface FacebookGroupsInput extends PaginationOptions {\n /** Facebook group URLs */\n groupUrls: string[]\n /** Maximum posts per group */\n maxPostsPerGroup?: number\n /** Include comments */\n includeComments?: boolean\n}\n\nexport interface FacebookGroupPost extends FacebookPost {\n groupName?: string\n groupUrl?: string\n authorName?: string\n authorUrl?: string\n comments?: FacebookComment[]\n}\n\nexport interface FacebookCommentsInput extends PaginationOptions {\n /** Facebook post URLs */\n postUrls: string[]\n /** Maximum comments per post */\n maxCommentsPerPost?: 
number\n}\n\nexport interface FacebookComment {\n id: string\n text: string\n date: string\n likesCount?: number\n authorName?: string\n authorUrl?: string\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape Facebook posts from pages or profiles\n *\n * @param input - Posts scraping options\n * @param options - Actor run options\n * @returns Array of Facebook posts\n *\n * @example\n * ```typescript\n * const posts = await scrapeFacebookPosts({\n * pageUrls: ['https://www.facebook.com/SomePage'],\n * maxPostsPerPage: 100\n * })\n *\n * // Filter in code - only high-engagement posts\n * const viral = posts.filter(p =>\n * p.likesCount > 1000 || p.sharesCount > 100\n * )\n * ```\n */\nexport async function scrapeFacebookPosts(\n input: FacebookPostsInput,\n options?: ActorRunOptions\n): Promise\u003cFacebookPost[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apify/facebook-posts-scraper', {\n startUrls: input.pageUrls.map(url => ({ url })),\n maxPosts: input.maxPostsPerPage || 50,\n fromDate: input.fromDate,\n toDate: input.toDate\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Facebook posts scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((post: any) => ({\n id: post.id,\n url: post.url,\n text: post.text,\n caption: post.text,\n postDate: post.time,\n timestamp: post.time,\n pageUrl: post.pageUrl,\n pageName: post.pageName,\n likesCount: post.likes,\n commentsCount: post.comments,\n sharesCount: post.shares,\n imageUrls: post.images,\n videoUrl: post.video,\n type: post.postType\n }))\n}\n\n/**\n * 
Scrape posts from Facebook groups\n *\n * @param input - Groups scraping options\n * @param options - Actor run options\n * @returns Array of group posts\n *\n * @example\n * ```typescript\n * const posts = await scrapeFacebookGroups({\n * groupUrls: ['https://www.facebook.com/groups/somegroupid'],\n * maxPostsPerGroup: 50,\n * includeComments: true\n * })\n *\n * // Filter in code - only posts with active discussion\n * const activeDiscussions = posts.filter(p =>\n * p.commentsCount > 10\n * )\n * ```\n */\nexport async function scrapeFacebookGroups(\n input: FacebookGroupsInput,\n options?: ActorRunOptions\n): Promise\u003cFacebookGroupPost[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apify/facebook-groups-scraper', {\n startUrls: input.groupUrls.map(url => ({ url })),\n maxPosts: input.maxPostsPerGroup || 50,\n includeComments: input.includeComments || false\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Facebook groups scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((post: any) => ({\n id: post.id,\n url: post.url,\n text: post.text,\n caption: post.text,\n postDate: post.time,\n timestamp: post.time,\n groupName: post.groupName,\n groupUrl: post.groupUrl,\n authorName: post.authorName,\n authorUrl: post.authorUrl,\n likesCount: post.likes,\n // Assumption: the raw comments field may be a count or an array of comment objects; handle both\n commentsCount: Array.isArray(post.comments) ? post.comments.length : post.comments,\n sharesCount: post.shares,\n imageUrls: post.images,\n videoUrl: post.video,\n comments: Array.isArray(post.comments)\n ? post.comments.map((c: any) => ({\n id: c.id,\n text: c.text,\n date: c.time,\n likesCount: c.likes,\n authorName: c.authorName,\n authorUrl: c.authorUrl\n }))\n : undefined\n }))\n}\n\n/**\n * Scrape Facebook comments from posts\n *\n * @param input - Comments scraping options\n * @param options - Actor 
run options\n * @returns Array of comments\n *\n * @example\n * ```typescript\n * const comments = await scrapeFacebookComments({\n * postUrls: ['https://www.facebook.com/post/123'],\n * maxCommentsPerPost: 200\n * })\n *\n * // Filter in code - only highly-liked comments\n * const topComments = comments\n * .filter(c => c.likesCount > 50)\n * .sort((a, b) => b.likesCount - a.likesCount)\n * ```\n */\nexport async function scrapeFacebookComments(\n input: FacebookCommentsInput,\n options?: ActorRunOptions\n): Promise\u003cFacebookComment[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apify/facebook-comments-scraper', {\n startUrls: input.postUrls.map(url => ({ url })),\n maxComments: input.maxCommentsPerPost || 100\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Facebook comments scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((comment: any) => ({\n id: comment.id,\n text: comment.text,\n date: comment.time,\n likesCount: comment.likes,\n authorName: comment.authorName,\n authorUrl: comment.authorUrl\n }))\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":7149,"content_sha256":"bca2f7988fc43f89c03ed30432aef7539d0e5c9876e8b8d830bc9187c6a1aea2"},{"filename":"actors/social-media/index.ts","content":"/**\n * Social Media Actors\n *\n * Comprehensive social media scraping capabilities:\n * - Instagram (145k users)\n * - LinkedIn (26k users)\n * - TikTok (90k users)\n * - YouTube (40k users)\n * - Facebook (35k users)\n * - Twitter/X (Unlimited)\n */\n\nexport * from './instagram'\nexport * from './linkedin'\nexport * from './tiktok'\nexport * from './youtube'\nexport * from './facebook'\nexport * from 
'./twitter'\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":403,"content_sha256":"90ca00741ca2058e8a7f92540908f5204ba9b5aad76d741b5cd3f72f4860d97f"},{"filename":"actors/social-media/instagram.ts","content":"/**\n * Instagram Scraper\n *\n * Apify Actor: apify/instagram-scraper (145,279 users, 4.60 rating)\n * Pricing: $0.50-$2.70 per 1000 results (tiered)\n *\n * Extract Instagram profiles, posts, hashtags, comments without login.\n */\n\nimport { Apify } from '../../index'\nimport type {\n UserProfile,\n Post,\n EngagementMetrics,\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface InstagramProfileInput {\n /** Instagram username (without @) */\n username: string\n /** Maximum number of latest posts to include */\n maxPosts?: number\n /** Include profile metadata */\n includeMetadata?: boolean\n}\n\nexport interface InstagramProfile extends UserProfile {\n username: string\n fullName: string\n biography?: string\n externalUrl?: string\n followersCount: number\n followingCount: number\n postsCount: number\n isPrivate?: boolean\n isVerified?: boolean\n latestPosts?: InstagramPost[]\n}\n\nexport interface InstagramPost extends Post {\n id: string\n shortCode: string\n url: string\n caption?: string\n imageUrl?: string\n videoUrl?: string\n likesCount: number\n commentsCount: number\n timestamp: string\n type: 'Image' | 'Video' | 'Sidecar'\n location?: {\n name?: string\n slug?: string\n }\n hashtags?: string[]\n mentions?: string[]\n isSponsored?: boolean\n}\n\nexport interface InstagramPostsInput extends PaginationOptions {\n /** Instagram username (without @) */\n username: string\n /** Maximum number of posts to scrape */\n maxResults?: number\n}\n\nexport interface InstagramHashtagInput extends PaginationOptions {\n /** Hashtag 
(without #) */\n hashtag: string\n /** Maximum number of posts to scrape */\n maxResults?: number\n}\n\nexport interface InstagramCommentInput extends PaginationOptions {\n /** Instagram post URL */\n postUrl: string\n /** Maximum number of comments to scrape */\n maxResults?: number\n}\n\nexport interface InstagramComment {\n id: string\n text: string\n timestamp: string\n likesCount: number\n ownerUsername: string\n ownerProfilePicUrl?: string\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape Instagram profile data\n *\n * @param input - Profile scraping options\n * @param options - Actor run options (memory, timeout)\n * @returns Profile data with optional latest posts\n *\n * @example\n * ```typescript\n * // Get profile with latest 12 posts\n * const profile = await scrapeInstagramProfile({\n * username: 'exampleuser',\n * maxPosts: 12\n * })\n *\n * // Filter in code - only high-engagement posts\n * const viralPosts = profile.latestPosts?.filter(p => p.likesCount > 10000)\n * console.log(`Found ${viralPosts?.length} viral posts`)\n * ```\n */\nexport async function scrapeInstagramProfile(\n input: InstagramProfileInput,\n options?: ActorRunOptions\n): Promise\u003cInstagramProfile> {\n const apify = new Apify()\n\n // Call the Instagram Profile Scraper actor\n const run = await apify.callActor('apify/instagram-profile-scraper', {\n usernames: [input.username],\n resultsLimit: input.maxPosts || 12\n }, options)\n\n // Wait for completion\n await apify.waitForRun(run.id)\n\n // Check status\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Instagram profile scraping failed: ${finalRun.status}`)\n }\n\n // Get results\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit: 1 })\n\n if (items.length === 0) 
{\n throw new Error(`Profile not found: @${input.username}`)\n }\n\n const profile = items[0]\n\n // Transform to our interface\n return {\n id: profile.id,\n username: profile.username,\n fullName: profile.fullName || profile.username,\n biography: profile.biography,\n externalUrl: profile.externalUrl,\n profilePictureUrl: profile.profilePicUrl,\n followersCount: profile.followersCount || 0,\n followingCount: profile.followsCount || 0,\n postsCount: profile.postsCount || 0,\n isPrivate: profile.private,\n isVerified: profile.verified,\n verified: profile.verified,\n latestPosts: profile.latestPosts?.map((post: any) => transformPost(post))\n }\n}\n\n/**\n * Scrape Instagram posts from a profile\n *\n * @param input - Posts scraping options\n * @param options - Actor run options\n * @returns Array of Instagram posts\n *\n * @example\n * ```typescript\n * // Scrape latest 50 posts\n * const posts = await scrapeInstagramPosts({\n * username: 'exampleuser',\n * maxResults: 50\n * })\n *\n * // Filter in code - posts from last 30 days with high engagement\n * const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)\n * const recentPopular = posts.filter(p =>\n * new Date(p.timestamp).getTime() > thirtyDaysAgo &&\n * p.likesCount > 1000\n * )\n *\n * // Only filtered results reach model context!\n * console.log(recentPopular)\n * ```\n */\nexport async function scrapeInstagramPosts(\n input: InstagramPostsInput,\n options?: ActorRunOptions\n): Promise\u003cInstagramPost[]> {\n const apify = new Apify()\n\n // Use the dedicated Instagram post scraper actor\n const run = await apify.callActor('apify/instagram-post-scraper', {\n usernames: [input.username],\n resultsLimit: input.maxResults || 50\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Instagram posts scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n 
const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map(transformPost)\n}\n\n/**\n * Scrape Instagram posts by hashtag\n *\n * @param input - Hashtag scraping options\n * @param options - Actor run options\n * @returns Array of Instagram posts with that hashtag\n *\n * @example\n * ```typescript\n * // Scrape top 100 posts for a hashtag\n * const posts = await scrapeInstagramHashtag({\n * hashtag: 'ai',\n * maxResults: 100\n * })\n *\n * // Filter in code - only videos with many likes\n * const popularVideos = posts.filter(p =>\n * p.type === 'Video' &&\n * p.likesCount > 5000\n * ).slice(0, 10)\n * ```\n */\nexport async function scrapeInstagramHashtag(\n input: InstagramHashtagInput,\n options?: ActorRunOptions\n): Promise\u003cInstagramPost[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apify/instagram-hashtag-scraper', {\n hashtags: [input.hashtag],\n resultsLimit: input.maxResults || 100\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Instagram hashtag scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map(transformPost)\n}\n\n/**\n * Scrape Instagram comments from a post\n *\n * @param input - Comment scraping options\n * @param options - Actor run options\n * @returns Array of comments\n *\n * @example\n * ```typescript\n * const comments = await scrapeInstagramComments({\n * postUrl: 'https://www.instagram.com/p/ABC123/',\n * maxResults: 100\n * })\n *\n * // Filter in code - the 10 most-liked comments with more than 10 likes\n * const popularComments = comments\n * .filter(c => c.likesCount > 10)\n * .sort((a, b) => b.likesCount - a.likesCount)\n * .slice(0, 10)\n * ```\n */\nexport async function 
scrapeInstagramComments(\n input: InstagramCommentInput,\n options?: ActorRunOptions\n): Promise\u003cInstagramComment[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apify/instagram-comment-scraper', {\n directUrls: [input.postUrl],\n resultsLimit: input.maxResults || 100\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Instagram comments scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((item: any) => ({\n id: item.id,\n text: item.text,\n timestamp: item.timestamp,\n likesCount: item.likesCount || 0,\n ownerUsername: item.ownerUsername,\n ownerProfilePicUrl: item.ownerProfilePicUrl\n }))\n}\n\n/* ============================================================================\n * HELPERS\n * ========================================================================= */\n\n/**\n * Transform raw Instagram post data to our standard format\n */\nfunction transformPost(post: any): InstagramPost {\n return {\n id: post.id,\n shortCode: post.shortCode,\n url: post.url || `https://www.instagram.com/p/${post.shortCode}/`,\n caption: post.caption,\n imageUrl: post.displayUrl || post.imageUrl,\n videoUrl: post.videoUrl,\n likesCount: post.likesCount || 0,\n commentsCount: post.commentsCount || 0,\n viewsCount: post.videoViewCount,\n timestamp: post.timestamp,\n type: post.type || (post.videoUrl ? 'Video' : 'Image'),\n location: post.locationName ? {\n name: post.locationName,\n slug: post.locationSlug\n } : undefined,\n hashtags: post.hashtags,\n mentions: post.mentions,\n isSponsored: post.isSponsored,\n text: post.caption,\n author: post.ownerUsername ? 
{\n username: post.ownerUsername,\n fullName: post.ownerFullName\n } : undefined\n }\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":9689,"content_sha256":"e71b18b711ebe761833623dc62a700c99ac7221c2e9cea910f74cddd46a5ff59"},{"filename":"actors/social-media/linkedin.ts","content":"/**\n * LinkedIn Scraper\n *\n * Top Actors:\n * - dev_fusion/Linkedin-Profile-Scraper (26,635 users, 4.10 rating, $10/1k results)\n * - curious_coder/linkedin-jobs-scraper (9,430 users, 4.98 rating, $1/1k results)\n * - supreme_coder/linkedin-post (3,663 users, 4.16 rating, $0.001/post)\n *\n * Extract LinkedIn profiles, jobs, posts, company data without cookies.\n */\n\nimport { Apify } from '../../index'\nimport type {\n UserProfile,\n Post,\n PaginationOptions,\n ActorRunOptions,\n Location,\n ContactInfo\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface LinkedInProfileInput {\n /** LinkedIn profile URL */\n profileUrl: string\n /** Include email extraction (requires website visit) */\n includeEmail?: boolean\n}\n\nexport interface LinkedInProfile extends UserProfile {\n fullName: string\n headline?: string\n location?: string\n about?: string\n profileUrl: string\n company?: string\n position?: string\n email?: string\n phone?: string\n website?: string\n connectionsCount?: number\n skills?: string[]\n experience?: LinkedInExperience[]\n education?: LinkedInEducation[]\n languages?: string[]\n}\n\nexport interface LinkedInExperience {\n title: string\n company: string\n location?: string\n startDate?: string\n endDate?: string\n description?: string\n duration?: string\n}\n\nexport interface LinkedInEducation {\n school: string\n degree?: string\n field?: string\n startYear?: number\n endYear?: number\n}\n\nexport interface LinkedInJobsInput extends PaginationOptions {\n /** Job search 
keywords */\n keywords: string\n /** Location (e.g., \"San Francisco, CA\") */\n location?: string\n /** Maximum number of jobs to scrape */\n maxResults?: number\n /** Date posted filter (\"past-24h\", \"past-week\", \"past-month\", \"any\") */\n datePosted?: string\n /** Experience level filter */\n experienceLevel?: string[]\n /** Remote filter */\n remote?: boolean\n}\n\nexport interface LinkedInJob {\n id: string\n title: string\n company: string\n companyUrl?: string\n companyLogo?: string\n location: string\n description: string\n postedDate: string\n applicants?: string\n jobUrl: string\n seniority?: string\n employmentType?: string\n jobFunctions?: string[]\n industries?: string[]\n salary?: string\n}\n\nexport interface LinkedInPostsInput extends PaginationOptions {\n /** LinkedIn profile or company URL */\n profileUrl: string\n /** Maximum number of posts to scrape */\n maxResults?: number\n}\n\nexport interface LinkedInPost extends Post {\n id: string\n url: string\n text: string\n authorName?: string\n authorUrl?: string\n authorHeadline?: string\n likesCount: number\n commentsCount: number\n sharesCount?: number\n timestamp: string\n imageUrls?: string[]\n videoUrl?: string\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape LinkedIn profile data including email\n *\n * @param input - Profile scraping options\n * @param options - Actor run options\n * @returns LinkedIn profile data\n *\n * @example\n * ```typescript\n * // Scrape profile with email\n * const profile = await scrapeLinkedInProfile({\n * profileUrl: 'https://www.linkedin.com/in/exampleuser',\n * includeEmail: true\n * })\n *\n * console.log(`${profile.fullName} - ${profile.headline}`)\n * console.log(`Email: ${profile.email}`)\n * ```\n */\nexport async function scrapeLinkedInProfile(\n input: LinkedInProfileInput,\n options?: 
ActorRunOptions\n): Promise\u003cLinkedInProfile> {\n const apify = new Apify()\n\n const run = await apify.callActor('dev_fusion/Linkedin-Profile-Scraper', {\n urls: [input.profileUrl],\n includeEmail: input.includeEmail || false\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`LinkedIn profile scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit: 1 })\n\n if (items.length === 0) {\n throw new Error(`Profile not found: ${input.profileUrl}`)\n }\n\n const profile = items[0]\n\n return {\n fullName: profile.fullName || profile.name,\n headline: profile.headline,\n bio: profile.about,\n about: profile.about,\n location: profile.location,\n profileUrl: input.profileUrl,\n company: profile.company,\n position: profile.position || profile.headline,\n email: profile.email,\n phone: profile.phone,\n website: profile.website,\n connectionsCount: profile.connections,\n followersCount: profile.followers,\n skills: profile.skills,\n experience: profile.experience,\n education: profile.education,\n languages: profile.languages\n }\n}\n\n/**\n * Search LinkedIn jobs\n *\n * @param input - Job search parameters\n * @param options - Actor run options\n * @returns Array of LinkedIn jobs\n *\n * @example\n * ```typescript\n * // Search for remote AI jobs\n * const jobs = await searchLinkedInJobs({\n * keywords: 'artificial intelligence engineer',\n * location: 'United States',\n * remote: true,\n * maxResults: 100\n * })\n *\n * // Filter in code - only senior roles with high applicants\n * const competitiveRoles = jobs.filter(j =>\n * j.seniority?.includes('Senior') &&\n * parseInt(j.applicants || '0') > 100\n * )\n * ```\n */\nexport async function searchLinkedInJobs(\n input: LinkedInJobsInput,\n options?: ActorRunOptions\n): Promise\u003cLinkedInJob[]> {\n const apify = 
new Apify()\n\n const run = await apify.callActor('curious_coder/linkedin-jobs-scraper', {\n keyword: input.keywords,\n location: input.location,\n maxItems: input.maxResults || 50,\n datePosted: input.datePosted,\n experienceLevel: input.experienceLevel,\n remote: input.remote\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`LinkedIn jobs scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((job: any) => ({\n id: job.jobId || job.id,\n title: job.title,\n company: job.company,\n companyUrl: job.companyUrl,\n companyLogo: job.companyLogo,\n location: job.location,\n description: job.description,\n postedDate: job.postedDate || job.postedAt,\n applicants: job.applicants,\n jobUrl: job.jobUrl || job.url,\n seniority: job.seniority,\n employmentType: job.employmentType,\n jobFunctions: job.jobFunctions,\n industries: job.industries,\n salary: job.salary,\n url: job.jobUrl || job.url\n }))\n}\n\n/**\n * Scrape LinkedIn posts from a profile or company\n *\n * @param input - Posts scraping options\n * @param options - Actor run options\n * @returns Array of LinkedIn posts\n *\n * @example\n * ```typescript\n * // Scrape latest posts\n * const posts = await scrapeLinkedInPosts({\n * profileUrl: 'https://www.linkedin.com/in/exampleuser',\n * maxResults: 50\n * })\n *\n * // Filter in code - only high-engagement posts\n * const viral = posts.filter(p =>\n * p.likesCount > 100 || p.commentsCount > 20\n * )\n * ```\n */\nexport async function scrapeLinkedInPosts(\n input: LinkedInPostsInput,\n options?: ActorRunOptions\n): Promise\u003cLinkedInPost[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('supreme_coder/linkedin-post', {\n urls: [input.profileUrl],\n maxPosts: 
input.maxResults || 50\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`LinkedIn posts scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((post: any) => ({\n id: post.id || post.postId,\n url: post.url || post.postUrl,\n text: post.text || post.content,\n authorName: post.authorName,\n authorUrl: post.authorUrl,\n authorHeadline: post.authorHeadline,\n likesCount: post.likesCount || post.likes || 0,\n commentsCount: post.commentsCount || post.comments || 0,\n sharesCount: post.sharesCount || post.shares,\n viewsCount: post.viewsCount,\n timestamp: post.timestamp || post.postedAt,\n imageUrls: post.images,\n videoUrl: post.videoUrl,\n caption: post.text\n }))\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":8615,"content_sha256":"e2207e1140fa39f11459b5805c61e30b251d6dae62a2e8ada30ddf61b2c0f7c7"},{"filename":"actors/social-media/tiktok.ts","content":"/**\n * TikTok Scraper\n *\n * Top Actors:\n * - clockworks/tiktok-scraper (90,141 users, 4.61 rating)\n * - scraptik/tiktok-api (1,329 users, 4.68 rating, $0.002/request - LOWEST COST)\n *\n * Extract TikTok profiles, videos, hashtags, comments without login.\n */\n\nimport { Apify } from '../../index'\nimport type {\n UserProfile,\n Post,\n EngagementMetrics,\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface TikTokProfileInput {\n /** TikTok username (without @) */\n username: string\n /** Maximum number of videos to include */\n maxVideos?: number\n}\n\nexport interface TikTokProfile extends 
UserProfile {\n id: string\n username: string\n nickname?: string\n signature?: string\n verified?: boolean\n followersCount: number\n followingCount: number\n heartCount?: number\n videoCount?: number\n videos?: TikTokVideo[]\n}\n\nexport interface TikTokVideo extends Post {\n id: string\n url: string\n text?: string\n desc?: string\n createTime: string\n videoUrl?: string\n coverUrl?: string\n playCount?: number\n likeCount: number\n commentCount: number\n shareCount: number\n downloadCount?: number\n musicTitle?: string\n musicAuthor?: string\n authorUsername?: string\n authorNickname?: string\n hashtags?: string[]\n mentions?: string[]\n isAd?: boolean\n}\n\nexport interface TikTokHashtagInput extends PaginationOptions {\n /** Hashtag (without #) */\n hashtag: string\n /** Maximum number of videos to scrape */\n maxResults?: number\n}\n\nexport interface TikTokCommentsInput extends PaginationOptions {\n /** TikTok video URL */\n videoUrl: string\n /** Maximum number of comments to scrape */\n maxResults?: number\n}\n\nexport interface TikTokComment {\n id: string\n text: string\n createTime: string\n likeCount: number\n replyCount?: number\n username: string\n userNickname?: string\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape TikTok profile data\n *\n * @param input - Profile scraping options\n * @param options - Actor run options\n * @returns TikTok profile with videos\n *\n * @example\n * ```typescript\n * const profile = await scrapeTikTokProfile({\n * username: 'exampleuser',\n * maxVideos: 30\n * })\n *\n * // Filter in code - only viral videos\n * const viral = profile.videos?.filter(v => v.playCount > 1000000)\n * ```\n */\nexport async function scrapeTikTokProfile(\n input: TikTokProfileInput,\n options?: ActorRunOptions\n): Promise\u003cTikTokProfile> {\n const apify = new Apify()\n\n const run = 
await apify.callActor('clockworks/tiktok-profile-scraper', {\n profiles: [`https://www.tiktok.com/@${input.username}`],\n resultsPerPage: input.maxVideos || 30\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`TikTok profile scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit: 1 })\n\n if (items.length === 0) {\n throw new Error(`Profile not found: @${input.username}`)\n }\n\n const profile = items[0]\n\n return {\n id: profile.authorMeta?.id,\n username: input.username,\n nickname: profile.authorMeta?.name,\n fullName: profile.authorMeta?.name,\n bio: profile.authorMeta?.signature,\n signature: profile.authorMeta?.signature,\n verified: profile.authorMeta?.verified,\n followersCount: profile.authorMeta?.fans || 0,\n followingCount: profile.authorMeta?.following || 0,\n heartCount: profile.authorMeta?.heart,\n videoCount: profile.authorMeta?.video,\n videos: profile.posts?.map(transformVideo)\n }\n}\n\n/**\n * Scrape TikTok videos by hashtag\n *\n * @param input - Hashtag scraping options\n * @param options - Actor run options\n * @returns Array of TikTok videos\n *\n * @example\n * ```typescript\n * const videos = await scrapeTikTokHashtag({\n * hashtag: 'ai',\n * maxResults: 100\n * })\n *\n * // Filter in code - only high engagement\n * const topVideos = videos\n * .filter(v => v.likeCount > 10000)\n * .sort((a, b) => b.likeCount - a.likeCount)\n * .slice(0, 10)\n * ```\n */\nexport async function scrapeTikTokHashtag(\n input: TikTokHashtagInput,\n options?: ActorRunOptions\n): Promise\u003cTikTokVideo[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('clockworks/tiktok-hashtag-scraper', {\n hashtags: [input.hashtag],\n resultsPerPage: input.maxResults || 100\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await 
apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`TikTok hashtag scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map(transformVideo)\n}\n\n/**\n * Scrape TikTok comments from a video\n *\n * @param input - Comment scraping options\n * @param options - Actor run options\n * @returns Array of comments\n *\n * @example\n * ```typescript\n * const comments = await scrapeTikTokComments({\n * videoUrl: 'https://www.tiktok.com/@user/video/123',\n * maxResults: 200\n * })\n *\n * // Filter in code - only popular comments\n * const popular = comments.filter(c => c.likeCount > 50)\n * ```\n */\nexport async function scrapeTikTokComments(\n input: TikTokCommentsInput,\n options?: ActorRunOptions\n): Promise\u003cTikTokComment[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('clockworks/tiktok-comments-scraper', {\n postURLs: [input.videoUrl],\n maxComments: input.maxResults || 100\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`TikTok comments scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((comment: any) => ({\n id: comment.cid,\n text: comment.text,\n createTime: comment.createTime,\n likeCount: comment.diggCount || 0,\n replyCount: comment.replyCommentTotal,\n username: comment.user?.uniqueId,\n userNickname: comment.user?.nickname\n }))\n}\n\n/* ============================================================================\n * HELPERS\n * ========================================================================= */\n\nfunction transformVideo(video: 
any): TikTokVideo {\n return {\n id: video.id,\n url: video.webVideoUrl || `https://www.tiktok.com/@${video.authorMeta?.name}/video/${video.id}`,\n text: video.text,\n desc: video.text,\n caption: video.text,\n createTime: video.createTime,\n timestamp: video.createTime,\n videoUrl: video.videoUrl,\n coverUrl: video.covers?.default,\n playCount: video.playCount,\n viewsCount: video.playCount,\n likeCount: video.diggCount || 0,\n likesCount: video.diggCount || 0,\n commentCount: video.commentCount || 0,\n commentsCount: video.commentCount || 0,\n shareCount: video.shareCount || 0,\n sharesCount: video.shareCount || 0,\n downloadCount: video.downloadCount,\n musicTitle: video.musicMeta?.musicName,\n musicAuthor: video.musicMeta?.musicAuthor,\n authorUsername: video.authorMeta?.name,\n authorNickname: video.authorMeta?.nickName,\n hashtags: video.hashtags?.map((h: any) => h.name),\n mentions: video.mentions,\n isAd: video.isAd\n }\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":7731,"content_sha256":"69413cc9ed12fbdcf5edacc120f44e5d05ac753ed01184a6c054acd53edefb84"},{"filename":"actors/social-media/twitter.ts","content":"/**\n * Twitter/X Scraper\n *\n * Top Actor:\n * - apidojo/twitter-scraper-lite (Unlimited, no rate limits, event-based pricing)\n *\n * Extract Twitter/X profiles, tweets, followers, and search results.\n */\n\nimport { Apify } from '../../index'\nimport type {\n UserProfile,\n Post,\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface TwitterProfileInput {\n /** Twitter username (without @) */\n username: string\n /** Include tweets in profile response */\n includeTweets?: boolean\n /** Maximum number of tweets to fetch */\n maxTweets?: number\n}\n\nexport interface TwitterProfile extends UserProfile {\n username: 
string\n displayName: string\n bio?: string\n location?: string\n website?: string\n profileImageUrl?: string\n bannerImageUrl?: string\n followersCount?: number\n followingCount?: number\n tweetsCount?: number\n verified?: boolean\n createdAt?: string\n latestTweets?: TwitterTweet[]\n}\n\nexport interface TwitterTweetsInput extends PaginationOptions {\n /** Twitter username (without @) */\n username: string\n /** Maximum number of tweets to scrape */\n maxTweets?: number\n /** Include replies */\n includeReplies?: boolean\n /** Include retweets */\n includeRetweets?: boolean\n}\n\nexport interface TwitterSearchInput extends PaginationOptions {\n /** Search query */\n query: string\n /** Maximum number of tweets to return */\n maxTweets?: number\n /** Search type: \"Latest\", \"Top\", \"People\", \"Photos\", \"Videos\" */\n searchType?: string\n}\n\nexport interface TwitterTweet extends Post {\n id: string\n url: string\n text: string\n authorUsername: string\n authorDisplayName: string\n timestamp: string\n likesCount: number\n retweetsCount: number\n repliesCount: number\n viewsCount?: number\n hashtags?: string[]\n mentions?: string[]\n imageUrls?: string[]\n videoUrl?: string\n isRetweet?: boolean\n isReply?: boolean\n quotedTweet?: TwitterTweet\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape Twitter/X profile data\n *\n * @param input - Profile scraping options\n * @param options - Actor run options\n * @returns Twitter profile data\n *\n * @example\n * ```typescript\n * // Scrape profile with latest tweets\n * const profile = await scrapeTwitterProfile({\n * username: 'exampleuser',\n * includeTweets: true,\n * maxTweets: 20\n * })\n *\n * console.log(`${profile.displayName} (@${profile.username})`)\n * console.log(`Followers: ${profile.followersCount}`)\n * console.log(`Latest tweets: 
${profile.latestTweets?.length}`)\n * ```\n */\nexport async function scrapeTwitterProfile(\n input: TwitterProfileInput,\n options?: ActorRunOptions\n): Promise\u003cTwitterProfile> {\n const apify = new Apify()\n\n const run = await apify.callActor('apidojo/twitter-scraper-lite', {\n mode: 'profile',\n username: input.username,\n maxTweets: input.includeTweets ? (input.maxTweets || 20) : 0\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Twitter profile scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit: 100 })\n\n if (items.length === 0) {\n throw new Error(`Profile not found: @${input.username}`)\n }\n\n // First item is profile, rest are tweets\n const profileData = items[0]\n const tweets = items.slice(1)\n\n return {\n username: profileData.username || input.username,\n displayName: profileData.name || profileData.displayName,\n bio: profileData.description || profileData.bio,\n location: profileData.location,\n website: profileData.url || profileData.website,\n profileImageUrl: profileData.profileImageUrl,\n bannerImageUrl: profileData.bannerImageUrl,\n followersCount: profileData.followersCount || profileData.followers,\n followingCount: profileData.followingCount || profileData.following,\n tweetsCount: profileData.tweetsCount || profileData.tweets,\n verified: profileData.verified || profileData.isVerified,\n createdAt: profileData.createdAt,\n latestTweets: tweets.map(mapToTwitterTweet)\n }\n}\n\n/**\n * Scrape tweets from a Twitter/X user\n *\n * @param input - Tweets scraping options\n * @param options - Actor run options\n * @returns Array of tweets\n *\n * @example\n * ```typescript\n * // Get latest tweets\n * const tweets = await scrapeTwitterTweets({\n * username: 'exampleuser',\n * maxTweets: 100,\n * includeReplies: false\n * })\n *\n * 
// Filter in code - only high engagement\n * const viral = tweets.filter(t =>\n * t.likesCount > 100 || t.retweetsCount > 50\n * )\n * ```\n */\nexport async function scrapeTwitterTweets(\n input: TwitterTweetsInput,\n options?: ActorRunOptions\n): Promise\u003cTwitterTweet[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apidojo/twitter-scraper-lite', {\n mode: 'tweets',\n username: input.username,\n maxTweets: input.maxTweets || 100,\n includeReplies: input.includeReplies !== false,\n includeRetweets: input.includeRetweets !== false\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Twitter tweets scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxTweets || 1000,\n offset: input.offset || 0\n })\n\n return items.map(mapToTwitterTweet)\n}\n\n/**\n * Search Twitter/X for tweets\n *\n * @param input - Search parameters\n * @param options - Actor run options\n * @returns Array of tweets matching search\n *\n * @example\n * ```typescript\n * // Search for AI security tweets\n * const tweets = await searchTwitter({\n * query: 'AI security',\n * maxTweets: 50,\n * searchType: 'Latest'\n * })\n *\n * // Filter in code - only from verified users\n * const verifiedTweets = tweets.filter(t =>\n * t.authorVerified === true\n * )\n * ```\n */\nexport async function searchTwitter(\n input: TwitterSearchInput,\n options?: ActorRunOptions\n): Promise\u003cTwitterTweet[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('apidojo/twitter-scraper-lite', {\n mode: 'search',\n query: input.query,\n maxTweets: input.maxTweets || 100,\n searchType: input.searchType || 'Latest'\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new 
Error(`Twitter search failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxTweets || 1000,\n offset: input.offset || 0\n })\n\n return items.map(mapToTwitterTweet)\n}\n\n/* ============================================================================\n * HELPERS\n * ========================================================================= */\n\nfunction mapToTwitterTweet(tweet: any): TwitterTweet {\n return {\n id: tweet.id || tweet.tweetId,\n url: tweet.url || `https://twitter.com/${tweet.authorUsername}/status/${tweet.id}`,\n text: tweet.text || tweet.fullText,\n authorUsername: tweet.authorUsername || tweet.username,\n authorDisplayName: tweet.authorName || tweet.displayName,\n timestamp: tweet.createdAt || tweet.timestamp,\n likesCount: tweet.likesCount || tweet.likes || 0,\n retweetsCount: tweet.retweetsCount || tweet.retweets || 0,\n repliesCount: tweet.repliesCount || tweet.replies || 0,\n viewsCount: tweet.viewsCount || tweet.views,\n commentsCount: tweet.repliesCount || tweet.replies || 0,\n hashtags: tweet.hashtags,\n mentions: tweet.mentions,\n imageUrls: tweet.media?.filter((m: any) => m.type === 'photo').map((m: any) => m.url),\n videoUrl: tweet.media?.find((m: any) => m.type === 'video')?.url,\n isRetweet: tweet.isRetweet,\n isReply: tweet.isReplyTo !== undefined,\n quotedTweet: tweet.quotedTweet ? 
mapToTwitterTweet(tweet.quotedTweet) : undefined,\n caption: tweet.text || tweet.fullText\n }\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":8295,"content_sha256":"c80173d5390b91f8e0e8f907d378757773f23cfc8cb487796e5f3adbef50a525"},{"filename":"actors/social-media/youtube.ts","content":"/**\n * YouTube Scraper\n *\n * Top Actors:\n * - streamers/youtube-scraper (40,455 users, 4.40 rating, $0.005/video)\n * - apidojo/youtube-scraper (4,336 users, 3.88 rating, $0.50/1k videos)\n *\n * Extract YouTube channels, videos, comments - no API quotas/limits!\n */\n\nimport { Apify } from '../../index'\nimport type {\n UserProfile,\n Post,\n EngagementMetrics,\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface YouTubeChannelInput {\n /** YouTube channel URL or ID */\n channelUrl: string\n /** Maximum number of videos to include */\n maxVideos?: number\n}\n\nexport interface YouTubeChannel extends UserProfile {\n id: string\n title: string\n url: string\n description?: string\n subscribersCount?: number\n videosCount?: number\n viewsCount?: number\n joinedDate?: string\n country?: string\n thumbnailUrl?: string\n bannerUrl?: string\n verified?: boolean\n videos?: YouTubeVideo[]\n}\n\nexport interface YouTubeVideo extends Post {\n id: string\n url: string\n title: string\n description?: string\n channelId?: string\n channelTitle?: string\n channelUrl?: string\n publishedAt: string\n viewsCount: number\n likesCount?: number\n commentsCount?: number\n duration?: string\n thumbnailUrl?: string\n tags?: string[]\n category?: string\n}\n\nexport interface YouTubeSearchInput extends PaginationOptions {\n /** Search query */\n query: string\n /** Maximum number of videos */\n maxResults?: number\n /** Upload date filter */\n uploadDate?: 'hour' | 
'today' | 'week' | 'month' | 'year'\n /** Duration filter */\n duration?: 'short' | 'medium' | 'long'\n /** Sort by */\n sortBy?: 'relevance' | 'date' | 'viewCount' | 'rating'\n}\n\nexport interface YouTubeCommentsInput extends PaginationOptions {\n /** YouTube video URL or ID */\n videoUrl: string\n /** Maximum number of comments */\n maxResults?: number\n}\n\nexport interface YouTubeComment {\n id: string\n text: string\n authorName: string\n authorChannelUrl?: string\n likesCount: number\n replyCount?: number\n publishedAt: string\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape YouTube channel data\n *\n * @param input - Channel scraping options\n * @param options - Actor run options\n * @returns YouTube channel with videos\n *\n * @example\n * ```typescript\n * const channel = await scrapeYouTubeChannel({\n * channelUrl: 'https://www.youtube.com/@exampleuser',\n * maxVideos: 50\n * })\n *\n * // Filter in code - only high-performing videos\n * const topVideos = channel.videos\n * ?.filter(v => v.viewsCount > 10000)\n * .sort((a, b) => b.viewsCount - a.viewsCount)\n * .slice(0, 10)\n * ```\n */\nexport async function scrapeYouTubeChannel(\n input: YouTubeChannelInput,\n options?: ActorRunOptions\n): Promise\u003cYouTubeChannel> {\n const apify = new Apify()\n\n const run = await apify.callActor('streamers/youtube-channel-scraper', {\n startUrls: [input.channelUrl],\n maxResults: input.maxVideos || 50\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`YouTube channel scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems()\n\n if (items.length === 0) {\n throw new Error(`Channel not found: ${input.channelUrl}`)\n }\n\n // 
First item is channel info, rest are videos\n const channelData = items[0]\n const videos = items.slice(1).map(transformVideo)\n\n return {\n id: channelData.channelId,\n title: channelData.title,\n fullName: channelData.title,\n url: channelData.url || input.channelUrl,\n description: channelData.description,\n bio: channelData.description,\n subscribersCount: channelData.numberOfSubscribers,\n followersCount: channelData.numberOfSubscribers,\n videosCount: channelData.numberOfVideos,\n viewsCount: channelData.numberOfViews,\n joinedDate: channelData.joinedDate,\n country: channelData.country,\n thumbnailUrl: channelData.thumbnailUrl,\n bannerUrl: channelData.bannerUrl,\n verified: channelData.verified,\n videos\n }\n}\n\n/**\n * Search YouTube videos\n *\n * @param input - Search parameters\n * @param options - Actor run options\n * @returns Array of YouTube videos\n *\n * @example\n * ```typescript\n * const videos = await searchYouTube({\n * query: 'artificial intelligence tutorial',\n * maxResults: 100,\n * uploadDate: 'month',\n * sortBy: 'viewCount'\n * })\n *\n * // Filter in code - only videos with high engagement\n * const engaging = videos.filter(v =>\n * v.viewsCount > 50000 &&\n * (v.likesCount || 0) > 1000\n * )\n * ```\n */\nexport async function searchYouTube(\n input: YouTubeSearchInput,\n options?: ActorRunOptions\n): Promise\u003cYouTubeVideo[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('streamers/youtube-scraper', {\n searchKeywords: input.query,\n maxResults: input.maxResults || 50,\n uploadDate: input.uploadDate,\n videoDuration: input.duration,\n sortBy: input.sortBy || 'relevance'\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`YouTube search failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 
1000,\n offset: input.offset || 0\n })\n\n return items.map(transformVideo)\n}\n\n/**\n * Scrape YouTube comments from a video\n *\n * @param input - Comment scraping options\n * @param options - Actor run options\n * @returns Array of comments\n *\n * @example\n * ```typescript\n * const comments = await scrapeYouTubeComments({\n * videoUrl: 'https://www.youtube.com/watch?v=ABC123',\n * maxResults: 500\n * })\n *\n * // Filter in code - only highly-liked comments\n * const popular = comments\n * .filter(c => c.likesCount > 100)\n * .sort((a, b) => b.likesCount - a.likesCount)\n * ```\n */\nexport async function scrapeYouTubeComments(\n input: YouTubeCommentsInput,\n options?: ActorRunOptions\n): Promise\u003cYouTubeComment[]> {\n const apify = new Apify()\n\n const run = await apify.callActor('streamers/youtube-comments-scraper', {\n startUrls: [input.videoUrl],\n maxComments: input.maxResults || 100\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`YouTube comments scraping failed: ${finalRun.status}`)\n }\n\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxResults || 1000,\n offset: input.offset || 0\n })\n\n return items.map((comment: any) => ({\n id: comment.id,\n text: comment.text,\n authorName: comment.authorText,\n authorChannelUrl: comment.authorChannelUrl,\n likesCount: comment.likesCount || 0,\n replyCount: comment.replyCount,\n publishedAt: comment.publishedTimeText\n }))\n}\n\n/* ============================================================================\n * HELPERS\n * ========================================================================= */\n\nfunction transformVideo(video: any): YouTubeVideo {\n return {\n id: video.id,\n url: video.url || `https://www.youtube.com/watch?v=${video.id}`,\n title: video.title,\n text: video.title,\n description: video.description,\n 
channelId: video.channelId,\n channelTitle: video.channelName || video.channelTitle,\n channelUrl: video.channelUrl,\n publishedAt: video.date || video.publishedAt,\n timestamp: video.date || video.publishedAt,\n viewsCount: video.views || video.viewsCount || 0,\n likesCount: video.likes || video.likesCount,\n commentsCount: video.numberOfComments || video.commentsCount,\n duration: video.duration,\n thumbnailUrl: video.thumbnail || video.thumbnailUrl,\n tags: video.tags,\n category: video.category\n }\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":8167,"content_sha256":"74648bf89b3db6301158d1112c2b0f1c5ba3a3c074a138f4142dd9e585840bf5"},{"filename":"actors/web/index.ts","content":"/**\n * Web Scraping Actors\n *\n * - Web Scraper (94k users - general purpose)\n */\n\nexport * from './web-scraper'\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":112,"content_sha256":"d8ebb9d8a5cd94a71f33e3c8fddb12c63e41580eb83ec02add0395159545549f"},{"filename":"actors/web/web-scraper.ts","content":"/**\n * Web Scraper (General Purpose)\n *\n * Apify Actor: apify/web-scraper (94,522 users, 4.39 rating)\n * Pricing: FREE - only pay for Apify platform usage\n *\n * Crawl any website and extract structured data using JavaScript functions.\n * Most versatile actor - handles ANY website!\n */\n\nimport { Apify } from '../../index'\nimport type {\n PaginationOptions,\n ActorRunOptions\n} from '../../types'\n\n/* ============================================================================\n * TYPES\n * ========================================================================= */\n\nexport interface WebScraperInput {\n /** URLs to start crawling from */\n startUrls: string[]\n /** JavaScript function to extract data from each page */\n pageFunction?: string\n /** CSS selector for links to follow */\n linkSelector?: string\n /** Pseudo-URLs to match for crawling */\n pseudoUrls?: string[]\n /** Maximum pages to crawl */\n 
maxPagesPerCrawl?: number\n /** Maximum crawling depth */\n maxCrawlingDepth?: number\n /** Proxy configuration */\n useProxy?: boolean\n /** Page load event to wait for before extraction */\n waitUntil?: 'load' | 'domcontentloaded' | 'networkidle0' | 'networkidle2'\n}\n\nexport interface ScrapedPage {\n url: string\n title?: string\n html?: string\n text?: string\n [key: string]: any // Custom extracted data\n}\n\n/* ============================================================================\n * FUNCTIONS\n * ========================================================================= */\n\n/**\n * Scrape data from websites using custom extraction logic\n *\n * @param input - Web scraping configuration\n * @param options - Actor run options\n * @returns Array of scraped pages with extracted data\n *\n * @example\n * ```typescript\n * // Scrape product listings\n * const products = await scrapeWebsite({\n * startUrls: ['https://example.com/products'],\n * linkSelector: 'a.product-link',\n * maxPagesPerCrawl: 100,\n * pageFunction: `\n * async function pageFunction(context) {\n * const { request, $, log } = context\n *\n * // Extract data using jQuery-like selectors\n * return {\n * url: request.url,\n * title: $('h1.product-title').text(),\n * price: $('span.price').text(),\n * description: $('.description').text(),\n * inStock: $('.in-stock').length > 0\n * }\n * }\n * `\n * })\n *\n * // Filter in code - only available products under $100\n * const affordable = products.filter(p =>\n * p.inStock &&\n * parseFloat(p.price.replace('$', '')) \u003c 100\n * )\n * ```\n *\n * @example\n * ```typescript\n * // Simple HTML/text extraction\n * const pages = await scrapeWebsite({\n * startUrls: ['https://blog.example.com'],\n * linkSelector: 'a.post-link',\n * maxPagesPerCrawl: 50,\n * pageFunction: `\n * async function pageFunction(context) {\n * const { request, $, log } = context\n *\n * return {\n * url: request.url,\n * title: $('h1').first().text(),\n * author: $('.author').text(),\n * date: $('.date').text(),\n * content: $('.post-content').text(),\n * tags: $('.tag').map((i, el) => $(el).text()).get()\n * }\n * }\n * `\n * })\n *\n * // Filter in code - only recent posts\n * const recent = pages.filter(p => {\n * const postDate = new Date(p.date)\n * const monthAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)\n * return postDate.getTime() > monthAgo\n * })\n * ```\n */\nexport async function scrapeWebsite(\n input: WebScraperInput,\n options?: ActorRunOptions\n): Promise\u003cScrapedPage[]> {\n const apify = new Apify()\n\n // Default page function that extracts basic data\n const defaultPageFunction = `\n async function pageFunction(context) {\n const { request, $, log } = context\n\n return {\n url: request.url,\n title: $('title').text() || $('h1').first().text(),\n text: $('body').text().trim()\n }\n }\n `\n\n const run = await apify.callActor('apify/web-scraper', {\n startUrls: input.startUrls.map(url => ({ url })),\n pageFunction: input.pageFunction || defaultPageFunction,\n linkSelector: input.linkSelector,\n pseudoUrls: input.pseudoUrls?.map(pattern => ({ purl: pattern })),\n maxPagesPerCrawl: input.maxPagesPerCrawl || 100,\n maxCrawlingDepth: input.maxCrawlingDepth || 0,\n useProxy: input.useProxy || false,\n waitUntil: input.waitUntil || 'networkidle2'\n }, options)\n\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n if (finalRun.status !== 'SUCCEEDED') {\n throw new Error(`Web scraping failed: ${finalRun.status}`)\n }\n\n const dataset = 
apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({\n limit: input.maxPagesPerCrawl || 10000\n })\n\n return items as ScrapedPage[]\n}\n\n/**\n * Extract structured data from a single page\n *\n * @param url - URL to scrape\n * @param pageFunction - JavaScript function to extract data\n * @param options - Actor run options\n * @returns Extracted data\n *\n * @example\n * ```typescript\n * // Scrape a single product page\n * const product = await scrapePage(\n * 'https://example.com/product/123',\n * `async function pageFunction(context) {\n * const { $, request } = context\n * return {\n * name: $('h1.product-name').text(),\n * price: $('span.price').text(),\n * rating: parseFloat($('.rating').attr('data-rating')),\n * reviews: parseInt($('.review-count').text()),\n * images: $('img.product-image')\n * .map((i, el) => $(el).attr('src'))\n * .get()\n * }\n * }`\n * )\n *\n * console.log(`${product.name} - ${product.price}`)\n * console.log(`Rating: ${product.rating}/5 (${product.reviews} reviews)`)\n * ```\n */\nexport async function scrapePage(\n url: string,\n pageFunction: string,\n options?: ActorRunOptions\n): Promise\u003cScrapedPage> {\n const results = await scrapeWebsite({\n startUrls: [url],\n maxPagesPerCrawl: 1,\n pageFunction\n }, options)\n\n if (results.length === 0) {\n throw new Error(`Failed to scrape page: ${url}`)\n }\n\n return results[0]\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":6020,"content_sha256":"37db669ab59d8abffcb52ba2bef2a8699d5a87ef429b3a6471404d56d71d822e"},{"filename":"examples/comparison-test.ts","content":"#!/usr/bin/env bun\n\n/**\n * Comparison Test: MCP vs Code-First Apify\n *\n * Demonstrates the difference in approach and token usage between\n * traditional MCP tool calls and code-first execution.\n */\n\nimport { Apify } from '../index'\n\n// Utility to estimate token count\nfunction estimateTokens(data: any): number {\n const str = 
JSON.stringify(data)\n // Rough estimate: ~4 characters per token\n return Math.ceil(str.length / 4)\n}\n\nasync function demonstrateMCPApproach() {\n console.log('=== MCP APPROACH ===\\n')\n console.log('Traditional MCP flow with multiple round-trips through model context:\\n')\n\n console.log('Step 1: mcp__Apify__search-actors')\n console.log(' Input: { search: \"instagram scraper\", limit: 10 }')\n console.log(' → Tool definitions loaded: ~5,000 tokens')\n console.log(' → Search results returned: ~1,000 tokens')\n console.log(' → Results pass through model context')\n\n console.log('\\nStep 2: mcp__Apify__call-actor')\n console.log(' Input: { actor: \"apify/instagram-scraper\", input: {...} }')\n console.log(' → Run information returned: ~1,000 tokens')\n console.log(' → Results pass through model context')\n\n console.log('\\nStep 3: mcp__Apify__get-actor-output')\n console.log(' Input: { datasetId: \"xyz123\" }')\n console.log(' → FULL dataset returned: ~50,000 tokens (100 items)')\n console.log(' → ALL results pass through model context')\n console.log(' → Model must filter in subsequent reasoning step')\n\n console.log('\\nStep 4: Model reasoning to filter')\n console.log(' → Additional model call to process and filter')\n console.log(' → Context includes all 100 items again')\n\n console.log('\\n📊 MCP Total Token Usage:')\n console.log(' Tool definitions: 5,000 tokens')\n console.log(' Search results: 1,000 tokens')\n console.log(' Run info: 1,000 tokens')\n console.log(' Full dataset: 50,000 tokens')\n console.log(' ────────────────────────────────')\n console.log(' TOTAL: ~57,000 tokens')\n console.log(' Plus additional reasoning overhead!\\n')\n}\n\nasync function demonstrateCodeFirstApproach() {\n console.log('=== CODE-FIRST APPROACH ===\\n')\n console.log('Direct code execution with in-code filtering:\\n')\n\n const apify = new Apify()\n\n console.log('Step 1: Model reads README.md for API discovery')\n console.log(' → README.md content: ~200 
tokens')\n console.log(' → Progressive disclosure (only load what\\'s needed)')\n\n console.log('\\nStep 2: Model writes code to execute operations')\n const codeExample = `\nimport { Apify } from '~/.claude/filesystem-mcps/apify'\n\nconst apify = new Apify()\n\n// All operations in code - no intermediate context bloat\nconst actors = await apify.search(\"instagram scraper\")\nconst run = await apify.callActor(actors[0].id, {\n profiles: [\"target\"],\n resultsLimit: 100\n})\n\n// Wait for completion\nawait apify.waitForRun(actors[0].id, run.id)\n\n// Get dataset\nconst dataset = apify.getDataset(run.defaultDatasetId)\nconst items = await dataset.listItems()\n\n// CRITICAL: Filter in code BEFORE returning to model\nconst yesterday = Date.now() - 86400000\nconst filtered = items\n .filter(post => post.likesCount > 1000)\n .filter(post => post.timestamp > yesterday)\n .slice(0, 10)\n\n// Only 10 filtered results reach model context\nreturn filtered\n `.trim()\n\n console.log(' Code to execute (~300 tokens):')\n console.log(' ' + codeExample.split('\\n').join('\\n '))\n\n console.log('\\nStep 3: Code executes in bash environment')\n console.log(' → All operations happen locally')\n console.log(' → Intermediate results NEVER enter model context')\n console.log(' → Filtering happens in execution environment')\n\n console.log('\\nStep 4: Only filtered results return to model')\n console.log(' → Filtered dataset: 10 items (~500 tokens)')\n console.log(' → Model sees only what it needs')\n\n console.log('\\n📊 Code-First Total Token Usage:')\n console.log(' README discovery: 200 tokens')\n console.log(' Code execution: 300 tokens')\n console.log(' Filtered results: 500 tokens')\n console.log(' ────────────────────────────────')\n console.log(' TOTAL: ~1,000 tokens')\n console.log('\\n 💰 TOKEN SAVINGS: 98.2% reduction!')\n console.log(' ⚡ PERFORMANCE: Faster (no model round-trips)')\n console.log(' 🔒 PRIVACY: Intermediate data never in model context\\n')\n}\n\nasync function 
demonstrateFilteringComparison() {\n console.log('=== FILTERING COMPARISON ===\\n')\n\n // Simulate a dataset of 100 items\n const fullDataset = Array.from({ length: 100 }, (_, i) => ({\n id: `post_${i}`,\n username: `user${i}`,\n text: `This is post ${i} with some content`,\n likesCount: Math.floor(Math.random() * 5000),\n timestamp: Date.now() - Math.random() * 86400000 * 7,\n url: `https://instagram.com/p/${i}`\n }))\n\n // Filter to top 10 high-engagement recent posts\n const yesterday = Date.now() - 86400000\n const filtered = fullDataset\n .filter(post => post.likesCount > 1000)\n .filter(post => post.timestamp > yesterday)\n .sort((a, b) => b.likesCount - a.likesCount)\n .slice(0, 10)\n\n const fullTokens = estimateTokens(fullDataset)\n const filteredTokens = estimateTokens(filtered)\n const savings = ((fullTokens - filteredTokens) / fullTokens * 100).toFixed(1)\n\n console.log('Dataset Size Comparison:')\n console.log(` Full dataset: ${fullDataset.length} items (${fullTokens} tokens)`)\n console.log(` Filtered dataset: ${filtered.length} items (${filteredTokens} tokens)`)\n console.log(` Reduction: ${savings}% fewer tokens\\n`)\n\n console.log('MCP Approach:')\n console.log(` 1. Return all 100 items to model (${fullTokens} tokens)`)\n console.log(' 2. Model reasons about filtering criteria')\n console.log(' 3. Model makes another call to filter')\n console.log(' 4. All 100 items in context again during filtering')\n console.log(` Total: ~${fullTokens * 2} tokens (dataset appears 2x in context)\\n`)\n\n console.log('Code-First Approach:')\n console.log(' 1. Filter executed in code environment')\n console.log(' 2. 
Only 10 items returned to model')\n console.log(` Total: ~${filteredTokens} tokens\\n`)\n\n console.log(`💡 Key Insight: Code-first prevents ${fullDataset.length - filtered.length} irrelevant items`)\n console.log(' from ever entering the model context!\\n')\n}\n\nasync function main() {\n console.log('\\n╔═══════════════════════════════════════════════════════════╗')\n console.log('║ MCP vs Code-First Comparison: Apify Integration ║')\n console.log('╚═══════════════════════════════════════════════════════════╝\\n')\n\n await demonstrateMCPApproach()\n console.log('\\n' + '─'.repeat(60) + '\\n')\n\n await demonstrateCodeFirstApproach()\n console.log('\\n' + '─'.repeat(60) + '\\n')\n\n await demonstrateFilteringComparison()\n console.log('\\n' + '─'.repeat(60) + '\\n')\n\n console.log('=== CONCLUSION ===\\n')\n console.log('Code-first Apify integration provides:')\n console.log(' ✅ 98%+ token reduction through in-code filtering')\n console.log(' ✅ Faster execution (no model round-trips for control flow)')\n console.log(' ✅ Better privacy (intermediate data stays in execution env)')\n console.log(' ✅ Progressive disclosure (load only what you need)')\n console.log(' ✅ More maintainable (standard TypeScript, not tool schemas)\\n')\n\n console.log('When to use:')\n console.log(' • Data-heavy operations (scraping, large datasets)')\n console.log(' • Operations requiring filtering/transformation')\n console.log(' • Multiple sequential operations')\n console.log(' • Privacy-sensitive workflows\\n')\n}\n\n// Run if executed directly\nif (import.meta.main) {\n main()\n}\n\nexport { main }\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":8080,"content_sha256":"04bd05632154fc95bda3041f5493689a9f48b22803fd4b1c5ba1035642c75499"},{"filename":"examples/instagram-scraper.ts","content":"#!/usr/bin/env bun\n\n/**\n * Example: Instagram Scraper with Code-First Apify\n *\n * Demonstrates token savings through in-code filtering:\n * - MCP approach: 
~57,000 tokens\n * - Code-first: ~1,000 tokens (98.2% reduction)\n */\n\nimport { Apify } from '../index'\n\nasync function main() {\n console.log('=== Apify Code-First Example: Instagram Scraper ===\\n')\n\n // Initialize client (uses APIFY_TOKEN from environment)\n const apify = new Apify()\n\n try {\n // Step 1: Search for Instagram scraper actors\n console.log('1. Searching for Instagram scraper actors...')\n const actors = await apify.search('instagram scraper', { limit: 5 })\n\n console.log(` Found ${actors.length} actors:`)\n actors.forEach((actor, i) => {\n console.log(` ${i + 1}. ${actor.username}/${actor.name}`)\n console.log(` ${actor.title}`)\n console.log(` Stats: ${actor.stats.runs.total} runs, ${actor.stats.users.total} users\\n`)\n })\n\n // Select the most popular actor\n const selectedActor = actors[0]\n console.log(` Selected: ${selectedActor.username}/${selectedActor.name}\\n`)\n\n // Step 2: Call the actor (execute scraping)\n console.log('2. Calling actor to scrape Instagram profiles...')\n console.log(' (This is a dry run - modify input for real scraping)')\n\n // Example input - modify for actual use\n const input = {\n // Instagram profile usernames to scrape\n profiles: ['example'],\n\n // Limit results to avoid excessive runtime/costs\n resultsLimit: 50,\n\n // Other common options\n // searchLimit: 10,\n // proxy: { useApifyProxy: true }\n }\n\n console.log(' Input:', JSON.stringify(input, null, 2))\n console.log(' Note: Using dry run mode (not actually executing)\\n')\n\n // Uncomment to actually run:\n // const run = await apify.callActor(selectedActor.id, input, {\n // memory: 2048,\n // timeout: 300\n // })\n //\n // console.log(` Run started: ${run.id}`)\n // console.log(` Status: ${run.status}`)\n // console.log(` Container URL: ${run.containerUrl}\\n`)\n //\n // // Step 3: Wait for completion\n // console.log('3. 
Waiting for actor run to complete...')\n // await apify.waitForRun(selectedActor.id, run.id, { waitSecs: 300 })\n //\n // const finalRun = await apify.getRun(selectedActor.id, run.id)\n // console.log(` Final status: ${finalRun.status}`)\n //\n // if (finalRun.status !== 'SUCCEEDED') {\n // console.error(' Actor run failed!')\n // process.exit(1)\n // }\n //\n // // Step 4: Get dataset and filter results IN CODE\n // console.log('\\n4. Fetching and filtering results...')\n // const dataset = apify.getDataset(finalRun.defaultDatasetId)\n //\n // // Get all items\n // const allItems = await dataset.listItems({ limit: 100 })\n // console.log(` Total items retrieved: ${allItems.length}`)\n //\n // // KEY: Filter in code BEFORE returning to model context\n // const yesterday = Date.now() - 86400000 // 24 hours ago\n // const filtered = allItems\n // .filter(post => post.likesCount > 1000) // High engagement\n // .filter(post => post.timestamp > yesterday) // Recent\n // .sort((a, b) => b.likesCount - a.likesCount) // Top first\n // .slice(0, 10) // Top 10\n //\n // console.log(` Filtered to top ${filtered.length} high-engagement recent posts\\n`)\n //\n // // Step 5: Show token savings\n // const estimateTokens = (data: any) => {\n // return Math.ceil(JSON.stringify(data).length / 4)\n // }\n //\n // const mcpTokens = estimateTokens(allItems)\n // const codeTokens = estimateTokens(filtered)\n // const savings = ((mcpTokens - codeTokens) / mcpTokens * 100).toFixed(1)\n //\n // console.log('=== Token Savings ===')\n // console.log(`MCP approach (all items): ~${mcpTokens} tokens`)\n // console.log(`Code-first (filtered): ~${codeTokens} tokens`)\n // console.log(`Savings: ${savings}%`)\n //\n // // Return filtered results (only these reach model context)\n // return filtered\n\n console.log('3. 
Dry run complete!')\n console.log(' Uncomment the code above to actually execute the scraper.')\n console.log(' Make sure to:')\n console.log(' - Set valid Instagram profile usernames')\n console.log(' - Have sufficient Apify credits')\n console.log(' - Review actor documentation for input schema\\n')\n\n } catch (error) {\n console.error('Error:', error instanceof Error ? error.message : error)\n process.exit(1)\n }\n}\n\n// Run if executed directly\nif (import.meta.main) {\n main()\n}\n\nexport { main }\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":4616,"content_sha256":"eb19d922183e448e90693ff90414dac066c642bde2e7308f190bc639c1223e9c"},{"filename":"examples/smoke-test.ts","content":"#!/usr/bin/env bun\n\n/**\n * Smoke Test: Verify Apify Code-First API Works\n *\n * Tests basic functionality without executing expensive operations.\n */\n\nimport { Apify } from '../index'\n\nasync function main() {\n console.log('=== Apify Code-First Smoke Test ===\\n')\n\n if (!process.env.APIFY_TOKEN && !process.env.APIFY_API_KEY) {\n console.error('❌ APIFY_TOKEN or APIFY_API_KEY not set in environment')\n console.error(' Add to ${PAI_DIR}/.env: APIFY_TOKEN=apify_api_xxxxx')\n console.error(' Or: APIFY_API_KEY=apify_api_xxxxx\\n')\n process.exit(1)\n }\n\n const apify = new Apify()\n\n try {\n // Test 1: Search for actors\n console.log('Test 1: Searching for actors...')\n const actors = await apify.search('web scraper', { limit: 3 })\n\n if (actors.length === 0) {\n console.error('❌ No actors found - API may not be working')\n process.exit(1)\n }\n\n console.log(`✅ Found ${actors.length} actors:`)\n actors.forEach((actor, i) => {\n console.log(` ${i + 1}. 
${actor.username}/${actor.name}`)\n console.log(` ${actor.title}`)\n if (actor.stats?.totalRuns) {\n console.log(` Runs: ${actor.stats.totalRuns}`)\n }\n })\n console.log()\n\n // Test 2: Verify types\n console.log('Test 2: Verifying TypeScript types...')\n const firstActor = actors[0]\n if (!firstActor.id || !firstActor.name || !firstActor.username) {\n console.error('❌ Actor object missing required fields')\n process.exit(1)\n }\n console.log('✅ Actor types correct')\n console.log()\n\n // Test 3: Test token estimation\n console.log('Test 3: Token estimation...')\n const estimateTokens = (data: any) => {\n return Math.ceil(JSON.stringify(data).length / 4)\n }\n\n const tokens = estimateTokens(actors)\n console.log(`✅ ${actors.length} actors = ~${tokens} tokens`)\n console.log()\n\n console.log('=== ALL TESTS PASSED ===\\n')\n console.log('✅ Apify code-first API is working correctly')\n console.log('✅ Ready to use for scraping operations')\n console.log('✅ Token savings will apply when filtering datasets\\n')\n\n } catch (error) {\n console.error('❌ Test failed:', error instanceof Error ? 
error.message : error)\n process.exit(1)\n }\n}\n\nif (import.meta.main) {\n main()\n}\n\nexport { main }\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":2302,"content_sha256":"c1166bf4540539ae8ec38870d70bb0fb70d919de971f2b46b2351790200a6050"},{"filename":"index.ts","content":"/**\n * Apify Code-First Interface\n *\n * Replaces token-heavy MCP calls with direct code execution.\n * Enables in-code filtering and control flow for massive token savings.\n */\n\nimport { ApifyClient } from 'apify-client'\n\nexport interface Actor {\n id: string\n name: string\n username: string\n title: string\n description?: string\n createdAt?: string\n modifiedAt?: string\n stats?: {\n totalRuns?: number\n lastRunStartedAt?: string\n }\n}\n\nexport interface ActorRun {\n id: string\n actorId: string\n status: 'READY' | 'RUNNING' | 'SUCCEEDED' | 'FAILED' | 'TIMED-OUT' | 'ABORTED'\n startedAt: string\n finishedAt?: string\n defaultDatasetId: string\n defaultKeyValueStoreId: string\n buildNumber?: string\n exitCode?: number\n containerUrl?: string\n output?: any\n}\n\nexport interface DatasetOptions {\n offset?: number\n limit?: number\n fields?: string[]\n omit?: string[]\n clean?: boolean\n}\n\n/**\n * Main Apify client for code-first operations\n */\nexport class Apify {\n private client: ApifyClient\n\n constructor(token?: string) {\n this.client = new ApifyClient({\n token: token || process.env.APIFY_TOKEN || process.env.APIFY_API_KEY\n })\n }\n\n /**\n * Search for actors by keyword\n *\n * Fetches actors and filters client-side by query (name, title, description).\n * For better performance with many actors, consider listing all and caching.\n *\n * @param query - Search query (actor name, description, etc.)\n * @param options - Search options\n * @returns Array of matching actors\n */\n async search(query: string, options?: {\n limit?: number\n offset?: number\n }): Promise\u003cActor[]> {\n // Fetch more actors than needed to ensure we get 
enough matches\n const fetchLimit = Math.max((options?.limit ?? 10) * 3, 30)\n\n const { items } = await this.client.actors().list({\n limit: fetchLimit,\n offset: options?.offset ?? 0\n })\n\n // Filter client-side by query\n // Match if ANY word in query appears in actor fields\n const queryWords = query.toLowerCase().split(/\\s+/)\n const filtered = items.filter((actor: any) => {\n const name = (actor.name || '').toLowerCase()\n const title = (actor.title || '').toLowerCase()\n const description = (actor.description || '').toLowerCase()\n const username = (actor.username || '').toLowerCase()\n const searchText = `${name} ${title} ${description} ${username}`\n\n // Match if any query word is found\n return queryWords.some(word => searchText.includes(word))\n })\n\n // Return requested number of matches\n return filtered.slice(0, options?.limit ?? 10) as Actor[]\n }\n\n /**\n * Call (execute) an actor\n *\n * @param actorId - Actor ID or \"username/actor-name\"\n * @param input - Actor input configuration\n * @param options - Runtime options (memory, timeout)\n * @returns Actor run information\n */\n async callActor(\n actorId: string,\n input: any,\n options?: {\n memory?: number // Memory in MB (128, 256, 512, 1024, etc.)\n timeout?: number // Timeout in seconds\n build?: string // Build number or tag\n }\n ): Promise\u003cActorRun> {\n const run = await this.client.actor(actorId).call(input, {\n memory: options?.memory,\n timeout: options?.timeout,\n build: options?.build\n })\n\n return run as ActorRun\n }\n\n /**\n * Get dataset interface for reading and filtering data\n *\n * @param datasetId - Dataset ID from actor run\n * @returns ApifyDataset instance\n */\n getDataset(datasetId: string): ApifyDataset {\n return new ApifyDataset(this.client, datasetId)\n }\n\n /**\n * Get actor run status\n *\n * @param runId - Run ID\n * @returns Run information\n */\n async getRun(runId: string): Promise\u003cActorRun> {\n const run = await 
this.client.run(runId).get()\n return run as ActorRun\n }\n\n /**\n * Wait for actor run to finish\n *\n * @param runId - Run ID\n * @param options - Wait options\n * @returns Final run information\n */\n async waitForRun(\n runId: string,\n options?: {\n waitSecs?: number\n }\n ): Promise\u003cActorRun> {\n const run = await this.client.run(runId).waitForFinish({\n waitSecs: options?.waitSecs\n })\n return run as ActorRun\n }\n}\n\n/**\n * Dataset interface for reading and filtering data\n *\n * KEY FEATURE: Filter data in code BEFORE returning to model context\n * This is where the massive token savings happen!\n */\nexport class ApifyDataset {\n constructor(\n private client: ApifyClient,\n private datasetId: string\n ) {}\n\n /**\n * List dataset items\n *\n * @param options - List options (pagination, fields)\n * @returns Array of dataset items\n */\n async listItems(options?: DatasetOptions): Promise\u003cany[]> {\n const { items } = await this.client.dataset(this.datasetId).listItems({\n offset: options?.offset,\n limit: options?.limit,\n fields: options?.fields,\n omit: options?.omit,\n clean: options?.clean\n })\n\n return items\n }\n\n /**\n * Get all dataset items (handles pagination automatically)\n *\n * WARNING: For large datasets, use listItems with limit\n * or filter in code to avoid excessive tokens\n *\n * @returns Array of all items\n */\n async getAllItems(): Promise\u003cany[]> {\n const allItems: any[] = []\n let offset = 0\n const limit = 1000\n\n while (true) {\n const { items, count, total } = await this.client.dataset(this.datasetId).listItems({\n offset,\n limit\n })\n\n allItems.push(...items)\n\n if (offset + count >= total) break\n offset += limit\n }\n\n return allItems\n }\n\n /**\n * Helper: Filter items by predicate function\n *\n * This is a convenience method - you can also filter\n * using standard array methods after listItems()\n *\n * @param predicate - Filter function\n * @returns Filtered items\n */\n async 
filter(predicate: (item: any) => boolean): Promise\u003cany[]> {\n const items = await this.getAllItems()\n return items.filter(predicate)\n }\n\n /**\n * Helper: Get top N items by sort function\n *\n * @param sortFn - Sort comparison function\n * @param limit - Number of items to return\n * @returns Top N sorted items\n */\n async top(sortFn: (a: any, b: any) => number, limit: number): Promise\u003cany[]> {\n const items = await this.getAllItems()\n return items.sort(sortFn).slice(0, limit)\n }\n}\n\n// Re-export for convenience\nexport { ApifyClient }\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":6388,"content_sha256":"7a9c8daafce3cfc8e7dc3ee734185c0e8d98d49106632adc01174167f20b9fca"},{"filename":"INTEGRATION.md","content":"# Apify Integration Guide\n\n**Status:** Production Ready ✅\n**Token Savings:** 90-98% vs traditional MCP approach\n**Execution Time:** ~10 seconds typical\n\n## Integration with PAI Skills\n\n### Social Skill Integration\n\n\n**Updated Section:** \"Fetching Tweet Content\"\n\nThe social skill now uses code-based Apify scripts instead of `mcp__apify` MCP tool.\n\n**Trigger → Script Mapping:**\n\n| User Says | Script to Run |\n|-----------|---------------|\n| \"my latest tweet\" | `get-latest-tweet.ts` |\n| \"my latest thread\" | `get-latest-thread.ts` |\n| \"get tweets from @user\" | `get-user-tweets.ts user 5` |\n| \"what has @user been talking about\" | `get-user-tweets.ts user 10` |\n\n**Example Workflow:**\n\n1. User: \"Turn my latest tweet into a LinkedIn post\"\n2. System runs: `bun ~/.claude/filesystem-mcps/apify/get-latest-tweet.ts`\n3. Script returns: Tweet text + metadata (~500 tokens)\n4. System transforms tweet into LinkedIn format\n5. 
**Token savings: 98%** (vs fetching unfiltered profile data)\n\n### Research Skill Integration\n\n**Use Case:** Monitor influential developers' Twitter activity\n\n```bash\n# Research what ThePrimeagen is discussing\nbun ~/.claude/filesystem-mcps/apify/get-user-tweets.ts ThePrimeagen 10\n\n# Analyze Paul Graham's recent thoughts\nbun ~/.claude/filesystem-mcps/apify/get-user-tweets.ts paulg 20\n\n# Track Simon Willison's posts\nbun ~/.claude/filesystem-mcps/apify/get-user-tweets.ts simonw 15\n```\n\n**Token Efficiency:**\n- 10 tweets unfiltered: ~80,000 tokens\n- 10 tweets filtered: ~8,000 tokens\n- **Savings: 90%**\n\n### Writing Skill Integration\n\n**Use Case:** Generate blog content from Twitter discussions\n\n```bash\n# Get user's thread about AI topic\nbun ~/.claude/filesystem-mcps/apify/get-latest-thread.ts\n\n# Expand thread into blog post format\n# Token efficient: only thread content in context\n```\n\n## Available Scripts Summary\n\n### 1. get-latest-tweet.ts\n**Purpose:** User's most recent single tweet\n**Usage:** `bun get-latest-tweet.ts`\n**Returns:** Text, date, URL, engagement stats\n**Tokens:** ~500\n\n### 2. get-latest-thread.ts\n**Purpose:** User's most recent Twitter thread\n**Usage:** `bun get-latest-thread.ts`\n**Returns:** All thread tweets chronologically\n**Tokens:** ~5,500 (for 5-tweet thread)\n**Savings:** 87-90% vs unfiltered\n\n### 3. get-user-tweets.ts\n**Purpose:** Any user's recent tweets\n**Usage:** `bun get-user-tweets.ts \u003cusername> \u003climit>`\n**Returns:** Recent tweets with metadata\n**Tokens:** ~800 per tweet\n**Savings:** 90-95% vs unfiltered\n\n### 4. 
debug-tweet-structure.ts\n**Purpose:** Inspect raw API response\n**Usage:** `bun debug-tweet-structure.ts`\n**Returns:** Full JSON structure + available fields\n**Use:** Development/debugging only\n\n## Migration from MCP\n\n### Before (MCP Approach)\n\n```typescript\n// Step 1: Search for actors (~1,000 tokens)\nmcp__Apify__search-actors(\"twitter scraper\")\n\n// Step 2: Call actor (~1,000 tokens)\nmcp__Apify__call-actor(actorId, input)\n\n// Step 3: Get output (~50,000 tokens unfiltered!)\nmcp__Apify__get-actor-output(runId)\n\n// Total: ~57,000 tokens\n```\n\n### After (Code-Based Approach)\n\n```typescript\n// All in one script, filtering in code\nbun ~/.claude/filesystem-mcps/apify/get-latest-tweet.ts\n\n// Returns only filtered result: ~500 tokens\n// Savings: 98.2%\n```\n\n## Best Practices\n\n### DO:\n✅ Use appropriate script for the task\n✅ Let script filter data before returning\n✅ Trust token savings calculations\n✅ Run from `~/.claude/filesystem-mcps/apify/` directory or use full path\n✅ Check execution time (~10 seconds expected)\n\n### DON'T:\n❌ Fall back to MCP tools for Twitter operations\n❌ Fetch unfiltered data into model context\n❌ Re-implement filtering logic (use existing scripts)\n❌ Skip error handling (scripts handle common errors)\n❌ Ignore token savings metrics in output\n\n## Performance Expectations\n\n**Execution Time:**\n- Actor search: Eliminated (hardcoded actor ID)\n- Actor execution: ~10 seconds (Apify platform time)\n- Data processing: \u003c1 second (TypeScript filtering)\n- **Total: ~10 seconds**\n\n**Token Usage:**\n- Single tweet: 500 tokens (vs 57,000 MCP)\n- Thread (5 tweets): 5,500 tokens (vs 60,000 unfiltered)\n- User tweets (10): 8,000 tokens (vs 80,000 unfiltered)\n\n**Rate Limits:**\n- Apify free tier: 100 actor runs/day\n- Apify paid tier: Unlimited\n- Current usage: Well within limits\n\n## Error Handling\n\nScripts handle common errors automatically:\n\n1. 
**Missing APIFY_TOKEN** → Clear error message with setup instructions\n2. **Actor failure** → Reports status and exits cleanly\n3. **No results** → Graceful message, no crash\n4. **Network timeout** → Configurable timeout (120s default)\n\n**Manual intervention rarely needed.**\n\n## Future Enhancements\n\n### Planned Features:\n\n1. **Search tweets by topic**\n - `search-tweets.ts \u003cusername> \u003cquery> \u003climit>`\n - Example: Search user's tweets about \"AI\" from last month\n\n2. **Thread detection improvements**\n - Better handling of quote tweets\n - Reply chain analysis\n - Thread continuity verification\n\n3. **Engagement analytics**\n - Filter by minimum engagement threshold\n - Sort by engagement metrics\n - Engagement trend analysis\n\n4. **Export formats**\n - JSON output for programmatic use\n - Markdown format for documentation\n - CSV for spreadsheet analysis\n\n### Migration Candidates:\n\nOther Apify actors worth implementing:\n- Instagram scraping\n- LinkedIn scraping\n- YouTube data extraction\n- Generic web scraping\n\n**Same pattern applies:** Filter in code, 90%+ token savings expected.\n\n## Documentation\n\n**For Users:**\n- Quick reference: `~/.claude/`\n- Social skill: `~/.claude/`\n\n**For Developers:**\n- Implementation: `~/.claude/`\n- Standards: `~/.claude/`\n- Parent guide: `~/.claude/`\n\n## Support\n\n**Common Questions:**\n\nQ: Why not use MCP?\nA: 90-98% token savings, faster execution, better control.\n\nQ: What if script fails?\nA: Check `APIFY_TOKEN` in `${PAI_DIR}/.env`, verify network, check Apify status.\n\nQ: Can I add new actors?\nA: Yes! 
Follow `STANDARDS.md` pattern, hardcode actor ID, filter in code.\n\nQ: How do I debug?\nA: Use `debug-tweet-structure.ts` to inspect raw data, check console output.\n\n## Success Metrics\n\n**Achieved:**\n- ✅ 90-98% token reduction vs MCP\n- ✅ ~10 second execution time\n- ✅ Production integration in social skill\n- ✅ 4 production-ready scripts\n- ✅ Comprehensive documentation\n\n**This is now the standard for all Twitter operations in PAI.**\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":6373,"content_sha256":"90ff4dfc1d1cc305daa3e87693a1842c96c289762dca5bafb9969a1ce5827e16"},{"filename":"package.json","content":"{\n \"name\": \"@pai/filesystem-mcp-apify\",\n \"version\": \"1.0.0\",\n \"type\": \"module\",\n \"description\": \"Code-first Apify interface for PAI - replaces token-heavy MCP calls\",\n \"main\": \"index.ts\",\n \"scripts\": {\n \"example\": \"bun run examples/instagram-scraper.ts\"\n },\n \"dependencies\": {\n \"apify-client\": \"^2.22.3\"\n },\n \"devDependencies\": {\n \"@types/bun\": \"latest\",\n \"typescript\": \"^5.0.0\"\n }\n}\n","content_type":"application/json; charset=utf-8","language":"json","size":407,"content_sha256":"0bda74058ca6f67acfe94a69f59a637c84c262051822c49040da35a209e4b347"},{"filename":"README.md","content":"# Apify Code-First API\n\n**Code-based replacement for token-heavy Apify MCP calls.**\n\nProgressive disclosure interface for web scraping and automation via the Apify platform. 
Filter data in code before returning to model context for massive token savings.\n\n## Quick Start\n\n```typescript\nimport { Apify } from '~/.claude/filesystem-mcps/apify'\n\nconst apify = new Apify(process.env.APIFY_TOKEN)\n\n// Search for actors\nconst actors = await apify.search(\"instagram scraper\")\n\n// Call an actor\nconst run = await apify.callActor(actors[0].id, {\n profiles: [\"target\"],\n resultsLimit: 100\n})\n\n// Get and filter results IN CODE (key to token savings!)\nconst dataset = await apify.getDataset(run.defaultDatasetId)\nconst items = await dataset.listItems()\n\n// Only filtered results reach model context\nconst relevant = items\n .filter(item => item.likesCount > 1000)\n .filter(item => item.timestamp > Date.now() - 86400000)\n .slice(0, 10)\n\nconsole.log(relevant) // Only 10 items vs 100+ unfiltered\n```\n\n## Why Code-First?\n\n**Token Comparison:**\n\n**MCP Approach** (~57,000 tokens):\n```\n1. mcp__Apify__search-actors → 1,000 tokens result\n2. mcp__Apify__call-actor → 1,000 tokens result\n3. 
mcp__Apify__get-actor-output → 50,000 tokens unfiltered dataset\n```\n\n**Code-First** (~1,000 tokens - 98.2% reduction):\n```typescript\n// All operations in code, filter before returning\nconst filtered = items.filter(...).slice(0, 10)\n// Only 10 filtered items (500 tokens) reach model\n```\n\n## Core API\n\n### Apify Class\n\nMain client for interacting with Apify platform.\n\n**Constructor:**\n```typescript\nnew Apify(token?: string)\n```\n- `token` - Apify API token (defaults to `process.env.APIFY_TOKEN`, then `process.env.APIFY_API_KEY`)\n\n**Methods:**\n\n#### `search(query, options?)`\nSearch for actors by keyword.\n\n```typescript\nconst actors = await apify.search(\"instagram scraper\", {\n limit: 10,\n offset: 0\n})\n```\n\n**Parameters:**\n- `query` - Search keywords\n- `options.limit` - Max results (default: 10)\n- `options.offset` - Skip results (default: 0)\n\n**Returns:** Array of Actor objects with id, name, title, description, stats\n\n#### `callActor(actorId, input, options?)`\nExecute an actor.\n\n```typescript\nconst run = await apify.callActor(\"apify/instagram-scraper\", {\n profiles: [\"target\"],\n resultsLimit: 100\n}, {\n memory: 2048,\n timeout: 300\n})\n```\n\n**Parameters:**\n- `actorId` - Actor ID or \"username/actor-name\"\n- `input` - Actor-specific input configuration\n- `options.memory` - Memory in MB (128, 256, 512, 1024, 2048, etc.)\n- `options.timeout` - Timeout in seconds\n- `options.build` - Build number or tag\n\n**Returns:** ActorRun object with run details and `defaultDatasetId`\n\n#### `getDataset(datasetId)`\nGet dataset interface for reading/filtering data. This call is synchronous.\n\n```typescript\nconst dataset = apify.getDataset(run.defaultDatasetId)\n```\n\n**Returns:** ApifyDataset instance\n\n#### `getRun(runId)`\nGet run status.\n\n```typescript\nconst run = await apify.getRun(runId)\n```\n\n**Returns:** ActorRun object with current status\n\n#### `waitForRun(runId, options?)`\nWait for run to finish.\n\n```typescript\nconst finalRun = 
await apify.waitForRun(runId, {\n waitSecs: 120\n})\n```\n\n**Returns:** Final ActorRun object when complete\n\n### ApifyDataset Class\n\nInterface for reading and filtering dataset results.\n\n**Key Concept:** Filter in code BEFORE returning to model context!\n\n**Methods:**\n\n#### `listItems(options?)`\nList dataset items with pagination.\n\n```typescript\nconst items = await dataset.listItems({\n offset: 0,\n limit: 100,\n fields: ['username', 'likesCount', 'text']\n})\n```\n\n**Parameters:**\n- `options.offset` - Skip items\n- `options.limit` - Max items\n- `options.fields` - Include only these fields\n- `options.omit` - Exclude these fields\n- `options.clean` - Return only non-empty items and skip hidden fields (those starting with `#`)\n\n**Returns:** Array of dataset items\n\n#### `getAllItems()`\nGet all items (handles pagination automatically).\n\n**Warning:** For large datasets, use `listItems()` with limit or filter in code.\n\n```typescript\nconst allItems = await dataset.getAllItems()\nconst filtered = allItems.filter(item => item.score > 0.8)\n```\n\n**Returns:** Array of all dataset items\n\n#### `filter(predicate)`\nHelper to filter items by predicate. Loads the full dataset via `getAllItems()` before filtering.\n\n```typescript\nconst relevant = await dataset.filter(item =>\n item.likesCount > 1000 &&\n item.timestamp > Date.now() - 86400000\n)\n```\n\n**Parameters:**\n- `predicate` - Filter function `(item) => boolean`\n\n**Returns:** Filtered items array\n\n#### `top(sortFn, limit)`\nHelper to get top N items by sort function. Loads the full dataset via `getAllItems()` before sorting.\n\n```typescript\nconst topPosts = await dataset.top(\n (a, b) => b.likesCount - a.likesCount,\n 10\n)\n```\n\n**Parameters:**\n- `sortFn` - Sort comparison function\n- `limit` - Number of items to return\n\n**Returns:** Top N sorted items\n\n## Common Patterns\n\n### Pattern 1: Search → Call → Filter Results\n\n```typescript\n// Find actor\nconst actors = await apify.search(\"web scraper\")\nconst actor = actors[0]\n\n// Execute actor\nconst run = await apify.callActor(actor.id, {\n startUrls: [\"https://example.com\"],\n 
maxPages: 50\n})\n\n// Wait for completion\nawait apify.waitForRun(run.id)\n\n// Get and filter results\nconst dataset = apify.getDataset(run.defaultDatasetId)\nconst items = await dataset.listItems({ limit: 100 })\n\n// Filter in code - only relevant items reach model\nconst relevant = items\n .filter(item => item.price \u003c 100)\n .filter(item => item.inStock)\n .slice(0, 10)\n```\n\n### Pattern 2: Process Large Dataset in Chunks\n\n```typescript\nconst dataset = apify.getDataset(datasetId)\n\n// Process in batches to avoid memory issues\nlet offset = 0\nconst limit = 1000\nconst results = []\n\nwhile (true) {\n const batch = await dataset.listItems({ offset, limit })\n if (batch.length === 0) break\n\n // Filter each batch\n const filtered = batch.filter(item => item.relevant === true)\n results.push(...filtered)\n\n offset += limit\n}\n\n// Only filtered results go to model context\nconsole.log(results)\n```\n\n### Pattern 3: Get Top Performers\n\n```typescript\nconst dataset = apify.getDataset(datasetId)\n\n// Get top 10 posts by engagement\nconst topPosts = await dataset.top(\n (a, b) => b.likesCount - a.likesCount,\n 10\n)\n\n// Only top 10 items (not entire dataset) reach model\nconsole.log(topPosts)\n```\n\n## Environment Variables\n\n```bash\n# Required\nAPIFY_TOKEN=apify_api_xxxxx...\n\n# Optional (uses defaults if not set)\nAPIFY_API_BASE_URL=https://api.apify.com/v2\n```\n\nGet your token from: https://console.apify.com/account/integrations\n\n## TypeScript Types\n\nAll types are exported from the main module:\n\n```typescript\nimport { Actor, ActorRun, DatasetOptions } from '~/.claude/filesystem-mcps/apify'\n```\n\n## Error Handling\n\n```typescript\ntry {\n const run = await apify.callActor(actorId, input)\n await apify.waitForRun(run.id)\n\n const finalRun = await apify.getRun(run.id)\n\n if (finalRun.status !== 'SUCCEEDED') {\n console.error('Actor run failed:', finalRun.status)\n return\n }\n\n // Process 
results...\n} catch (error) {\n console.error('Apify error:', error.message)\n}\n```\n\n## Running Examples\n\n```bash\n# Run the Instagram scraper example\ncd ~/.claude/filesystem-mcps/apify\nbun run examples/instagram-scraper.ts\n\n# Or use bun directly\nbun examples/instagram-scraper.ts\n```\n\n## Token Savings Calculator\n\nEstimate your token savings:\n\n```typescript\nfunction estimateTokens(data: any): number {\n const str = JSON.stringify(data)\n return Math.ceil(str.length / 4) // ~4 chars per token\n}\n\n// Before (MCP)\nconst allItems = await dataset.getAllItems() // 10,000 items\nconsole.log('MCP tokens:', estimateTokens(allItems)) // ~50,000\n\n// After (Code-First)\nconst filtered = allItems.filter(...).slice(0, 10) // 10 items\nconsole.log('Code tokens:', estimateTokens(filtered)) // ~500\n\n// Savings: 99% token reduction!\n```\n\n## When to Use Code-First vs MCP\n\n**Use Code-First (this API):**\n- ✅ Need to filter/transform large datasets\n- ✅ Processing 100+ results and want top 10\n- ✅ Multiple operations in sequence (search → call → filter)\n- ✅ Control flow (loops, conditionals)\n- ✅ Privacy-sensitive data that shouldn't enter model context\n\n**Use MCP:**\n- ❌ Simple single operations with small results\n- ❌ Need to expose to non-code-capable models\n- ❌ Provider-specific features not in this wrapper\n\n## Links\n\n- Apify Platform: https://apify.com\n- Apify Console: https://console.apify.com\n- Actor Store: https://apify.com/store\n- API Docs: https://docs.apify.com/api/v2\n- Parent README: `~/.claude/`\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":8445,"content_sha256":"52bcefbef5fdc8d1f5ec4f034884f086bb04a9e203ef51da76e2e2722a33f1ea"},{"filename":"skills/get-user-tweets.ts","content":"#!/usr/bin/env bun\n\n/**\n * Get latest tweets from any Twitter user using code-first Apify\n */\n\nimport { Apify } from '../index'\n\nasync function main() {\n const username = process.argv[2]\n const limit = 
parseInt(process.argv[3] || '5')\n\n if (!username) {\n console.error('Usage: bun get-user-tweets.ts \u003cusername> [limit]')\n console.error('Example: bun get-user-tweets.ts ThePrimeagen 5')\n process.exit(1)\n }\n\n console.log(`=== Getting Latest ${limit} Tweets from @${username} ===\\n`)\n\n const apify = new Apify()\n\n try {\n // Use known working actor: apidojo/twitter-scraper-lite\n const TWITTER_ACTOR_ID = 'apidojo/twitter-scraper-lite'\n\n console.log(`1. Scraping @${username} profile...`)\n\n const input = {\n username,\n max_posts: limit,\n maxTweets: limit,\n maxItems: limit,\n resultsLimit: limit,\n tweetsDesired: limit,\n searchTerms: [`from:${username}`],\n startUrls: [`https://twitter.com/${username}`]\n }\n\n console.log(` Fetching last ${limit} tweets...`)\n console.log(' (this may take 30-60 seconds)...')\n\n const run = await apify.callActor(TWITTER_ACTOR_ID, input, {\n memory: 2048,\n timeout: 120\n })\n\n console.log(` Run ID: ${run.id}`)\n console.log()\n\n // Step 2: Wait for completion\n console.log('2. Waiting for scraper to finish...')\n await apify.waitForRun(run.id, { waitSecs: 120 })\n\n const finalRun = await apify.getRun(run.id)\n console.log(` Status: ${finalRun.status}`)\n\n if (finalRun.status !== 'SUCCEEDED') {\n console.error(' Actor run did not succeed!')\n console.error(' Status:', finalRun.status)\n process.exit(1)\n }\n console.log()\n\n // Step 3: Get results\n console.log('3. Fetching results...')\n const dataset = apify.getDataset(finalRun.defaultDatasetId)\n const items = await dataset.listItems({ limit })\n\n console.log(` Retrieved ${items.length} tweets`)\n console.log()\n\n if (items.length === 0) {\n console.log(' No tweets found.')\n return\n }\n\n // Step 4: Show the tweets\n console.log('4. 
Latest tweets:')\n console.log(' ════════════════════════════════════════')\n console.log()\n\n items.forEach((tweet, i) => {\n console.log(` ${i + 1}/${items.length}:`)\n console.log(` ${tweet.text || tweet.fullText}`)\n console.log()\n console.log(` Posted: ${tweet.createdAt}`)\n if (tweet.url) {\n console.log(` URL: ${tweet.url}`)\n }\n console.log(' ────────────────────────────────────────')\n console.log()\n })\n\n // Step 5: Show token savings\n const estimateTokens = (data: any) => {\n return Math.ceil(JSON.stringify(data).length / 4)\n }\n\n const totalTokens = estimateTokens(items)\n console.log('5. Token efficiency:')\n console.log(` ${items.length} tweets: ~${totalTokens} tokens`)\n console.log(` Filtered in code before model context`)\n console.log()\n\n console.log('✅ Successfully retrieved tweets using code-first Apify!')\n\n } catch (error) {\n console.error('❌ Error:', error instanceof Error ? error.message : error)\n if (error instanceof Error && error.stack) {\n console.error('\\nStack:', error.stack)\n }\n process.exit(1)\n }\n}\n\nif (import.meta.main) {\n main()\n}\n\nexport { main }\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":3467,"content_sha256":"ecd7105bee66f68bf6cf8e1d7dbf229c3d31d1af81ec368807095b9a92236fd4"},{"filename":"tsconfig.json","content":"{\n \"compilerOptions\": {\n \"lib\": [\"ESNext\"],\n \"module\": \"esnext\",\n \"target\": \"esnext\",\n \"moduleResolution\": \"bundler\",\n \"moduleDetection\": \"force\",\n \"allowImportingTsExtensions\": true,\n \"noEmit\": true,\n \"composite\": true,\n \"strict\": true,\n \"downlevelIteration\": true,\n \"skipLibCheck\": true,\n \"jsx\": \"react-jsx\",\n \"allowSyntheticDefaultImports\": true,\n \"forceConsistentCasingInFileNames\": true,\n \"allowJs\": true,\n \"types\": [\n \"bun-types\"\n ]\n }\n}\n","content_type":"application/json; 
charset=utf-8","language":"json","size":502,"content_sha256":"3c3c1714e30747b6e0ff6122f0abd0691beeea4c09c21e6a90fa306758d36222"},{"filename":"types/common.ts","content":"/**\n * Common types shared across all Apify actors\n */\n\n/**\n * Standard pagination options for all scrapers\n */\nexport interface PaginationOptions {\n /** Maximum number of results to return */\n maxResults?: number\n /** Skip first N results */\n offset?: number\n}\n\n/**\n * Date range filter options\n */\nexport interface DateRangeOptions {\n /** Start date (ISO string or Date object) */\n from?: string | Date\n /** End date (ISO string or Date object) */\n to?: string | Date\n}\n\n/**\n * Engagement metrics common to social media posts\n */\nexport interface EngagementMetrics {\n likesCount?: number\n commentsCount?: number\n sharesCount?: number\n viewsCount?: number\n}\n\n/**\n * Standard user/profile information\n */\nexport interface UserProfile {\n id?: string\n username?: string\n fullName?: string\n bio?: string\n profilePictureUrl?: string\n followersCount?: number\n followingCount?: number\n verified?: boolean\n}\n\n/**\n * Standard post/content structure\n */\nexport interface Post extends EngagementMetrics {\n id: string\n url: string\n text?: string\n caption?: string\n timestamp: string\n author?: UserProfile\n imageUrls?: string[]\n videoUrl?: string\n hashtags?: string[]\n mentions?: string[]\n}\n\n/**\n * Geo-location data\n */\nexport interface Location {\n latitude?: number\n longitude?: number\n address?: string\n city?: string\n state?: string\n country?: string\n postalCode?: string\n}\n\n/**\n * Contact information\n */\nexport interface ContactInfo {\n email?: string\n phone?: string\n website?: string\n socialMedia?: {\n facebook?: string\n twitter?: string\n instagram?: string\n linkedin?: string\n youtube?: string\n }\n}\n\n/**\n * Business/place information\n */\nexport interface BusinessInfo {\n name: string\n category?: string\n rating?: number\n 
reviewsCount?: number\n priceLevel?: number\n location?: Location\n contact?: ContactInfo\n openingHours?: string[]\n isOpen?: boolean\n}\n\n/**\n * Actor run options for controlling execution\n */\nexport interface ActorRunOptions {\n /** Memory allocation in MB (128, 256, 512, 1024, 2048, 4096, 8192) */\n memory?: number\n /** Timeout in seconds */\n timeout?: number\n /** Build tag or number to use */\n build?: string\n}\n\n/**\n * Error result when actor fails\n */\nexport interface ActorError {\n message: string\n actorId: string\n runId?: string\n status?: string\n}\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":2344,"content_sha256":"109258ba3dcc6c380e4a071faeedafd3489bbf0ae23005dac937f4360114ede5"},{"filename":"types/index.ts","content":"/**\n * Type definitions for Apify actors\n */\n\nexport * from './common'\n","content_type":"text/typescript; charset=utf-8","language":"typescript","size":71,"content_sha256":"751b938cb3e1ffbfb79dfb22642672f4099a55e889adfdf3095e32f12da0dd1c"},{"filename":"Workflows/Update.md","content":"# Update Workflow\n\nCheck Apify API and actor ecosystem for updates.\n\n## Voice Notification\n\n```bash\ncurl -s -X POST http://localhost:31337/notify \\\n -H \"Content-Type: application/json\" \\\n -d '{\"message\": \"Running the Update workflow in the Apify skill to check updates\"}' \\\n > /dev/null 2>&1 &\n```\n\nRunning **Update** in **Apify**...\n\n---\n\n## When to Use\n\n- Monthly capability check\n- After Apify announces new features\n- If actor calls fail unexpectedly\n- When new popular actors become available\n\n## Official Source\n\n**API Docs:** https://docs.apify.com/api/v2\n**Changelog:** https://docs.apify.com/api/v2/changelog\n**Actor Store:** https://apify.com/store\n\n## Steps\n\n### 1. Check API Changelog\n\n```bash\nopen https://docs.apify.com/api/v2/changelog\n```\n\nReview for:\n- New endpoints\n- Breaking changes\n- Deprecated features\n- Rate limit changes\n\n### 2. 
Check Popular Actors\n\nReview commonly used actors for updates:\n\n| Actor | Purpose | Check For |\n|-------|---------|-----------|\n| apify/instagram-scraper | Instagram posts/profiles | Schema changes |\n| apify/twitter-scraper | Twitter/X data | API changes |\n| apify/google-maps-scraper | Business data | New fields |\n| apify/web-scraper | General scraping | New options |\n\n### 3. Test Current Implementation\n\n```bash\n# Verify API wrapper works\nbun run ../scrape-instagram.ts --help 2>/dev/null || echo \"Check script\"\n```\n\n### 4. Update Implementation\n\nIf new critical functionality found:\n1. Update `index.ts` API wrapper\n2. Add new actor scripts to skill\n3. Update type definitions\n4. Update SKILL.md documentation\n\n### 5. Update Actor Registry\n\nMaintain list of tested actors:\n\n| Actor | Last Tested | Status |\n|-------|-------------|--------|\n| instagram-scraper | 2026-01 | Working |\n| twitter-scraper | 2026-01 | Working |\n| google-maps | 2026-01 | Working |\n\n## Version Tracking\n\n```\n# Last sync: 2026-01-03\n# Apify API: v2\n# Tested actors: 10+\n# Known issues: None\n```\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":1932,"content_sha256":"2d7bfc8a63f3967c0c958881b7778fe693ea52be858f08f7f8c50b4bbc9523c0"}],"content_json":{"type":"doc","content":[{"type":"heading","attrs":{"level":2},"content":[{"text":"Customization","type":"text"}]},{"type":"paragraph","content":[{"text":"Before executing, check for user customizations at:","type":"text","marks":[{"type":"strong"}]},{"text":" ","type":"text"},{"text":"~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Apify/","type":"text","marks":[{"type":"code_inline"}]}]},{"type":"paragraph","content":[{"text":"If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. 
If the directory does not exist, proceed with skill defaults.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🚨 MANDATORY: Voice Notification (REQUIRED BEFORE ANY ACTION)","type":"text"}]},{"type":"paragraph","content":[{"text":"You MUST send this notification BEFORE doing anything else when this skill is invoked.","type":"text","marks":[{"type":"strong"}]}]},{"type":"ordered_list","attrs":{"order":1,"listStyle":"number"},"content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Send voice notification","type":"text","marks":[{"type":"strong"}]},{"text":":","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"curl -s -X POST http://localhost:31337/notify \\\n -H \"Content-Type: application/json\" \\\n -d '{\"message\": \"Running the WORKFLOWNAME workflow in the Apify skill to ACTION\"}' \\\n > /dev/null 2>&1 &","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Output text notification","type":"text","marks":[{"type":"strong"}]},{"text":":","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":""},"content":[{"text":"Running the **WorkflowName** workflow in the **Apify** skill to ACTION...","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"This is not optional. 
Execute this curl command immediately upon skill invocation.","type":"text","marks":[{"type":"strong"}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Apify - Social Media & Web Scraping","type":"text"}]},{"type":"paragraph","content":[{"text":"Direct TypeScript access to 9 popular Apify actors with 99% token savings.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🔌 File-Based MCP","type":"text"}]},{"type":"paragraph","content":[{"text":"This skill is a ","type":"text"},{"text":"file-based MCP","type":"text","marks":[{"type":"strong"}]},{"text":" - a code-first API wrapper that replaces token-heavy MCP protocol calls.","type":"text"}]},{"type":"paragraph","content":[{"text":"Why file-based?","type":"text","marks":[{"type":"strong"}]},{"text":" Filter data in code BEFORE returning to model context = 97.5% token savings.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🎯 Overview","type":"text"}]},{"type":"paragraph","content":[{"text":"Direct TypeScript access to the 9 most popular Apify actors without MCP overhead. 
Filter and transform data in code BEFORE it reaches the model context.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"📊 Available Actors","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Social Media (5 platforms)","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Instagram","type":"text","marks":[{"type":"strong"}]},{"text":" (145k users, 4.60★) - Profiles, posts, hashtags, comments","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"LinkedIn","type":"text","marks":[{"type":"strong"}]},{"text":" (26k users, 4.10★) - Profiles, jobs, posts","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"TikTok","type":"text","marks":[{"type":"strong"}]},{"text":" (90k users, 4.61★) - Profiles, videos, hashtags, comments","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"YouTube","type":"text","marks":[{"type":"strong"}]},{"text":" (40k users, 4.40★) - Channels, videos, comments, search","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Facebook","type":"text","marks":[{"type":"strong"}]},{"text":" (35k users, 4.56★) - Posts, groups, comments","type":"text"}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Business & Lead Generation","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Google Maps","type":"text","marks":[{"type":"strong"}]},{"text":" (198k users, 4.76★) - ","type":"text"},{"text":"HIGHEST VALUE!","type":"text","marks":[{"type":"strong"}]}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Search businesses, extract contacts, reviews, images","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Perfect for lead 
generation","type":"text"}]}]}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"E-commerce","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Amazon","type":"text","marks":[{"type":"strong"}]},{"text":" (8k users, 4.97★) - Products, reviews, pricing","type":"text"}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Web Scraping","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Web Scraper","type":"text","marks":[{"type":"strong"}]},{"text":" (94k users, 4.39★) - General-purpose, works with ANY website","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🚀 Quick Start","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Basic Usage Pattern","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { scrapeInstagramProfile, searchGoogleMaps } from 'actors'\n\n// 1. Call the actor wrapper\nconst profile = await scrapeInstagramProfile({\n username: 'target_username',\n maxPosts: 50\n})\n\n// 2. Filter in code - BEFORE data reaches model!\nconst viral = profile.latestPosts?.filter(p => p.likesCount > 10000)\n\n// 3. 
Only filtered results reach model context\nconsole.log(viral) // ~10 posts instead of 50","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"📚 Examples by Use Case","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Social Media Monitoring","type":"text"}]},{"type":"paragraph","content":[{"text":"Instagram - Track engagement:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { scrapeInstagramProfile, scrapeInstagramPosts } from 'actors'\n\n// Get profile with recent posts\nconst profile = await scrapeInstagramProfile({\n username: 'competitor',\n maxPosts: 100\n})\n\n// Filter in code - only high-performing posts from last 30 days\nconst thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)\nconst topRecent = profile.latestPosts\n ?.filter(p =>\n new Date(p.timestamp).getTime() > thirtyDaysAgo &&\n p.likesCount > 5000\n )\n .sort((a, b) => b.likesCount - a.likesCount)\n .slice(0, 10)\n\n// Only 10 posts reach model instead of 100!","type":"text"}]},{"type":"paragraph","content":[{"text":"LinkedIn - Job search:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { searchLinkedInJobs } from 'actors'\n\nconst jobs = await searchLinkedInJobs({\n keywords: 'AI engineer',\n location: 'San Francisco',\n remote: true,\n maxResults: 200\n})\n\n// Filter in code - only senior roles at well-funded startups\nconst topJobs = jobs.filter(j =>\n j.seniority?.includes('Senior') &&\n parseInt(j.applicants || '0') > 50\n)","type":"text"}]},{"type":"paragraph","content":[{"text":"TikTok - Trend analysis:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { scrapeTikTokHashtag } from 'actors'\n\nconst videos = await scrapeTikTokHashtag({\n hashtag: 'ai',\n maxResults: 
500\n})\n\n// Filter in code - only viral content\nconst viral = videos\n .filter(v => v.playCount > 1000000)\n .sort((a, b) => b.playCount - a.playCount)\n .slice(0, 20)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Lead Generation (Business Intelligence)","type":"text"}]},{"type":"paragraph","content":[{"text":"Google Maps - Local business leads:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { searchGoogleMaps } from 'actors'\n\n// Search with contact info extraction\nconst places = await searchGoogleMaps({\n query: 'restaurants in Austin',\n maxResults: 500,\n includeReviews: true,\n maxReviewsPerPlace: 20,\n scrapeContactInfo: true // Extracts emails from websites!\n})\n\n// Filter in code - only highly-rated with email/phone\nconst qualifiedLeads = places\n .filter(p =>\n p.rating >= 4.5 &&\n p.reviewsCount >= 100 &&\n (p.email || p.phone)\n )\n .map(p => ({\n name: p.name,\n rating: p.rating,\n reviews: p.reviewsCount,\n email: p.email,\n phone: p.phone,\n website: p.website,\n address: p.address\n }))\n\n// Export leads - only qualified results!\nconsole.log(`Found ${qualifiedLeads.length} qualified leads`)","type":"text"}]},{"type":"paragraph","content":[{"text":"Google Maps - Review sentiment analysis:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { scrapeGoogleMapsReviews } from 'actors'\n\nconst reviews = await scrapeGoogleMapsReviews({\n placeUrl: 'https://maps.google.com/maps?cid=12345',\n maxResults: 1000\n})\n\n// Filter in code - analyze sentiment by rating\nconst recentNegative = reviews\n .filter(r => {\n const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)\n return (\n r.rating \u003c= 2 &&\n new Date(r.publishedAtDate).getTime() > thirtyDaysAgo &&\n r.text.length > 50\n )\n })\n\n// Identify common complaints\nconst 
complaints = recentNegative.map(r => r.text)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"E-commerce & Competitive Intelligence","type":"text"}]},{"type":"paragraph","content":[{"text":"Amazon - Price monitoring:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { scrapeAmazonProduct } from 'actors'\n\nconst product = await scrapeAmazonProduct({\n productUrl: 'https://www.amazon.com/dp/B08L5VT894',\n includeReviews: true,\n maxReviews: 200\n})\n\n// Filter in code - only recent negative reviews\nconst recentNegative = product.reviews\n ?.filter(r => {\n const weekAgo = Date.now() - (7 * 24 * 60 * 60 * 1000)\n return (\n r.rating \u003c= 2 &&\n new Date(r.date).getTime() > weekAgo\n )\n })\n\nconsole.log(`Price: ${product.price}`)\nconsole.log(`Rating: ${product.rating}/5`)\nconsole.log(`Recent issues: ${recentNegative?.length} complaints`)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Custom Web Scraping","type":"text"}]},{"type":"paragraph","content":[{"text":"Any Website - Custom extraction:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { scrapeWebsite } from 'actors'\n\nconst products = await scrapeWebsite({\n startUrls: ['https://example.com/products'],\n linkSelector: 'a.product-link',\n maxPagesPerCrawl: 100,\n pageFunction: `\n async function pageFunction(context) {\n const { request, $, log } = context\n\n return {\n url: request.url,\n title: $('h1.product-title').text(),\n price: $('span.price').text(),\n inStock: $('.in-stock').length > 0,\n description: $('.description').text()\n }\n }\n `\n})\n\n// Filter in code - only available products under $100\nconst affordable = products.filter(p =>\n p.inStock &&\n parseFloat(p.price.replace('

$'
, '')) \u003c 100\n)","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🎨 Advanced Patterns","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Pattern 1: Multi-Platform Social Listening","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import {\n scrapeInstagramHashtag,\n scrapeTikTokHashtag,\n searchYouTube\n} from 'actors'\n\n// Run all platforms in parallel\nconst [instagramPosts, tiktokVideos, youtubeVideos] = await Promise.all([\n scrapeInstagramHashtag({ hashtag: 'ai', maxResults: 100 }),\n scrapeTikTokHashtag({ hashtag: 'ai', maxResults: 100 }),\n searchYouTube({ query: '#ai', maxResults: 100 })\n])\n\n// Combine and filter - only viral content across all platforms\nconst allViral = [\n ...instagramPosts.filter(p => p.likesCount > 10000),\n ...tiktokVideos.filter(v => v.playCount > 100000),\n ...youtubeVideos.filter(v => v.viewsCount > 50000)\n]\n\nconsole.log(`Found ${allViral.length} viral posts across 3 platforms`)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Pattern 2: Lead Enrichment Pipeline","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import { searchGoogleMaps, scrapeLinkedInProfile } from 'actors'\n\n// 1. Find businesses on Google Maps\nconst restaurants = await searchGoogleMaps({\n query: 'restaurants in SF',\n maxResults: 100,\n scrapeContactInfo: true\n})\n\n// 2. Filter for qualified leads\nconst qualified = restaurants.filter(r =>\n r.rating >= 4.5 &&\n r.email &&\n r.reviewsCount >= 50\n)\n\n// 3. Enrich with LinkedIn data (if available)\nconst enriched = await Promise.all(\n qualified.map(async (restaurant) => {\n // Try to find LinkedIn company page\n // ... 
additional enrichment logic\n return restaurant\n })\n)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Pattern 3: Competitive Analysis Dashboard","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"import {\n scrapeInstagramProfile,\n scrapeYouTubeChannel,\n scrapeTikTokProfile\n} from 'actors'\n\nasync function analyzeCompetitor(username: string) {\n // Gather data from all platforms\n const [instagram, youtube, tiktok] = await Promise.all([\n scrapeInstagramProfile({ username, maxPosts: 30 }),\n scrapeYouTubeChannel({ channelUrl: `https://youtube.com/@${username}`, maxVideos: 30 }),\n scrapeTikTokProfile({ username, maxVideos: 30 })\n ])\n\n // Calculate engagement metrics in code\n return {\n username,\n instagram: {\n followers: instagram.followersCount,\n avgLikes: average(instagram.latestPosts?.map(p => p.likesCount) || []),\n engagementRate: calculateEngagement(instagram)\n },\n youtube: {\n subscribers: youtube.subscribersCount,\n avgViews: average(youtube.videos?.map(v => v.viewsCount) || [])\n },\n tiktok: {\n followers: tiktok.followersCount,\n avgPlays: average(tiktok.videos?.map(v => v.playCount) || [])\n }\n }\n}","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"💰 Token Savings Calculator","type":"text"}]},{"type":"paragraph","content":[{"text":"Example: Instagram profile with 100 posts","type":"text","marks":[{"type":"strong"}]}]},{"type":"paragraph","content":[{"text":"MCP Approach:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":""},"content":[{"text":"1. search-actors → 1,000 tokens\n2. call-actor → 1,000 tokens\n3. 
get-actor-output → 50,000 tokens (100 unfiltered posts)\nTOTAL: ~52,000 tokens","type":"text"}]},{"type":"paragraph","content":[{"text":"File-Based Approach:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"const profile = await scrapeInstagramProfile({\n username: 'user',\n maxPosts: 100\n})\n\n// Filter in code - only top 10 posts\nconst top = profile.latestPosts\n ?.sort((a, b) => b.likesCount - a.likesCount)\n .slice(0, 10)\n\n// TOTAL: ~500 tokens (only 10 filtered posts reach model)","type":"text"}]},{"type":"paragraph","content":[{"text":"Savings: 99% reduction (52,000 → 500 tokens)","type":"text","marks":[{"type":"strong"}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🔧 Actor Reference","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Social Media","type":"text"}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"Instagram","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeInstagramProfile(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Profile + posts","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeInstagramPosts(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Posts from user","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeInstagramHashtag(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Posts by hashtag","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeInstagramComments(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Comments on 
post","type":"text"}]}]}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"LinkedIn","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeLinkedInProfile(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Profile + experience + email","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"searchLinkedInJobs(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Job listings","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeLinkedInPosts(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Posts from profile/company","type":"text"}]}]}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"TikTok","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeTikTokProfile(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Profile + videos","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeTikTokHashtag(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Videos by hashtag","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeTikTokComments(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Comments on video","type":"text"}]}]}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"YouTube","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeYouTubeChannel(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Channel + videos","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"searchYouTube(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Search 
videos","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeYouTubeComments(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Comments on video","type":"text"}]}]}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"Facebook","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeFacebookPosts(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Posts from pages","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeFacebookGroups(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Group posts","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeFacebookComments(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Post comments","type":"text"}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Business & Lead Generation","type":"text"}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"Google Maps","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"searchGoogleMaps(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Search places (with contact extraction!)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeGoogleMapsPlace(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Single place details","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeGoogleMapsReviews(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Place 
reviews","type":"text"}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"E-commerce","type":"text"}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"Amazon","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeAmazonProduct(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Product details + reviews","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeAmazonReviews(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Product reviews only","type":"text"}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Web Scraping","type":"text"}]},{"type":"heading","attrs":{"level":4},"content":[{"text":"General Web","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapeWebsite(input)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Custom multi-page crawling","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"scrapePage(url, pageFunction)","type":"text","marks":[{"type":"code_inline"}]},{"text":" - Single page extraction","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"⚙️ Configuration","type":"text"}]},{"type":"paragraph","content":[{"text":"Environment Variables:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# Required - Get from https://console.apify.com/account/integrations\nAPIFY_TOKEN=apify_api_xxxxx...","type":"text"}]},{"type":"paragraph","content":[{"text":"Actor Run Options:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"{\n memory: 2048, // MB: 128, 256, 512, 1024, 2048, 4096, 8192\n timeout: 300, // seconds\n build: 'latest' // or specific build 
number\n}","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🎯 When to Use This vs MCP","type":"text"}]},{"type":"paragraph","content":[{"text":"Use File-Based (this skill):","type":"text","marks":[{"type":"strong"}]}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"✅ Need to filter large datasets (>100 results)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"✅ Want to transform/aggregate data in code","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"✅ Multiple sequential operations","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"✅ Control flow (loops, conditionals)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"✅ Maximum token efficiency","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"Use MCP:","type":"text","marks":[{"type":"strong"}]}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"❌ Simple single operations with small results (\u003c10 items)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"❌ One-off exploratory queries","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"❌ Don't want to write code","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"🔗 Links","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Apify Platform: https://apify.com","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Actor Store: https://apify.com/store","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"API Docs: 
https://docs.apify.com/api/v2","type":"text"}]}]}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"paragraph","content":[{"text":"Remember: Filter data in code BEFORE returning to model context. This is where the 99% token savings happen!","type":"text","marks":[{"type":"strong"}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Gotchas","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Actor selection matters.","type":"text","marks":[{"type":"strong"}]},{"text":" Each social platform has specific actors — don't use a generic scraper for Instagram when a dedicated Instagram actor exists.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Rate limits vary by platform and plan.","type":"text","marks":[{"type":"strong"}]},{"text":" Check actor documentation for limits before running large scrapes.","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Scraped data format varies by actor.","type":"text","marks":[{"type":"strong"}]},{"text":" Read the actor's output schema before processing results.","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Examples","type":"text"}]},{"type":"paragraph","content":[{"text":"Example 1: Scrape Instagram profile","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":""},"content":[{"text":"User: \"get the recent posts from this Instagram account\"\n→ Selects Instagram Profile actor\n→ Runs with target profile URL\n→ Returns structured post data (text, engagement, dates)","type":"text"}]},{"type":"paragraph","content":[{"text":"Example 2: LinkedIn company scrape","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":""},"content":[{"text":"User: \"scrape this company's LinkedIn page\"\n→ Selects LinkedIn Company actor\n→ Returns company info, employee count, 
recent posts","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Execution Log","type":"text"}]},{"type":"paragraph","content":[{"text":"After completing any workflow, append a single JSONL entry:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"echo '{\"ts\":\"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'\",\"skill\":\"Apify\",\"workflow\":\"WORKFLOW_USED\",\"input\":\"8_WORD_SUMMARY\",\"status\":\"ok|error\",\"duration_s\":SECONDS}' >> ~/.claude/PAI/MEMORY/SKILLS/execution.jsonl","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Replace ","type":"text"},{"text":"WORKFLOW_USED","type":"text","marks":[{"type":"code_inline"}]},{"text":" with the workflow executed, ","type":"text"},{"text":"8_WORD_SUMMARY","type":"text","marks":[{"type":"code_inline"}]},{"text":" with a brief input description, and ","type":"text"},{"text":"SECONDS","type":"text","marks":[{"type":"code_inline"}]},{"text":" with approximate wall-clock time. 
Log ","type":"text"},{"text":"status: \"error\"","type":"text","marks":[{"type":"code_inline"}]},{"text":" if the workflow failed.","type":"text"}]}]},"metadata":{"date":"2026-06-05","name":"Apify","author":"@skillopedia","effort":"medium","source":{"stars":14561,"repo_name":"personal_ai_infrastructure","origin_url":"https://github.com/danielmiessler/personal_ai_infrastructure/blob/HEAD/Releases/v5.0.0/.claude/skills/Apify/SKILL.md","repo_owner":"danielmiessler","body_sha256":"65dd8b80c23fd0a8be921419341aaf22212b9b6cadc9fd661027ea5a38445825","cluster_key":"c90c7361d86c8497a49a22b9f9208bd1ed688bd767b13577b39feead8da40afb","clean_bundle":{"format":"clean-skill-bundle-v1","source":"danielmiessler/personal_ai_infrastructure/Releases/v5.0.0/.claude/skills/Apify/SKILL.md","attachments":[{"id":"ff54fddd-d5b6-5f79-a060-688dab59e4d0","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/ff54fddd-d5b6-5f79-a060-688dab59e4d0/attachment","path":".gitignore","size":39,"sha256":"56fe886c0783fec92d06c418144f62f79a02f030c06bfb99b7dec72be710821c","contentType":"text/plain; charset=utf-8"},{"id":"1f0cc907-0f71-534b-92c2-1a49535d7c1c","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/1f0cc907-0f71-534b-92c2-1a49535d7c1c/attachment.md","path":"INTEGRATION.md","size":6373,"sha256":"90ff4dfc1d1cc305daa3e87693a1842c96c289762dca5bafb9969a1ce5827e16","contentType":"text/markdown; charset=utf-8"},{"id":"744aed69-7567-5e71-b86d-475abefcd929","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/744aed69-7567-5e71-b86d-475abefcd929/attachment.md","path":"README.md","size":8445,"sha256":"52bcefbef5fdc8d1f5ec4f034884f086bb04a9e203ef51da76e2e2722a33f1ea","contentType":"text/markdown; charset=utf-8"},{"id":"793265aa-86e5-5ac2-aff6-c38d6e50d18b","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/793265aa-86e5-5ac2-aff6-c38d6e50d18b/attachment.md","path":"Workflows/Update.md","size":1932,"sha256":"2d7bfc8a63f3967c0c958881b7778fe693ea52be858f08f7f8c50b4bbc9523c0","contentType":"text/markdown; 
charset=utf-8"},{"id":"f446e96f-f044-55fb-bb2e-704e2e23c725","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/f446e96f-f044-55fb-bb2e-704e2e23c725/attachment.ts","path":"actors/business/google-maps.ts","size":11520,"sha256":"1bfe4eb2820980c67e4ef57666642392815eacdb0029b8158261df6dc4745be2","contentType":"text/typescript; charset=utf-8"},{"id":"3faac1c6-9532-58fc-b790-71b7d18af74d","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/3faac1c6-9532-58fc-b790-71b7d18af74d/attachment.ts","path":"actors/business/index.ts","size":126,"sha256":"3c200940d341b7afa7f427f025e2949c95dd436c19bb34e4d39a54e2aac8f608","contentType":"text/typescript; charset=utf-8"},{"id":"802fdc34-91fc-5f31-bfae-79958f4b55c0","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/802fdc34-91fc-5f31-bfae-79958f4b55c0/attachment.ts","path":"actors/ecommerce/amazon.ts","size":6526,"sha256":"8a474cf08694e65a83841b535d6a651e4b148d198d48520120360605d4be5c06","contentType":"text/typescript; charset=utf-8"},{"id":"c89a16f8-2a85-544d-a6fd-50a4432b1405","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c89a16f8-2a85-544d-a6fd-50a4432b1405/attachment.ts","path":"actors/ecommerce/index.ts","size":99,"sha256":"62d60f833cb2a9db8d4970e3eb7bc1d015c537e9967c51b78c5b9d79f8ede8cc","contentType":"text/typescript; charset=utf-8"},{"id":"b1838c12-9c11-5220-a189-6bcb6ab867ba","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/b1838c12-9c11-5220-a189-6bcb6ab867ba/attachment.ts","path":"actors/index.ts","size":1520,"sha256":"d24d3ee5f1fe955848815fa0f3a51a77fbb9f930acb29b65a280dd72bcf580d0","contentType":"text/typescript; charset=utf-8"},{"id":"41bfb26c-d241-5d37-a142-333ce71d8047","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/41bfb26c-d241-5d37-a142-333ce71d8047/attachment.ts","path":"actors/social-media/facebook.ts","size":7149,"sha256":"bca2f7988fc43f89c03ed30432aef7539d0e5c9876e8b8d830bc9187c6a1aea2","contentType":"text/typescript; 
charset=utf-8"},{"id":"9ebd895e-8139-5241-a1df-52861ea81fa0","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/9ebd895e-8139-5241-a1df-52861ea81fa0/attachment.ts","path":"actors/social-media/index.ts","size":403,"sha256":"90ca00741ca2058e8a7f92540908f5204ba9b5aad76d741b5cd3f72f4860d97f","contentType":"text/typescript; charset=utf-8"},{"id":"867a42f4-375f-5b4e-80a1-b208b30dab1e","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/867a42f4-375f-5b4e-80a1-b208b30dab1e/attachment.ts","path":"actors/social-media/instagram.ts","size":9689,"sha256":"e71b18b711ebe761833623dc62a700c99ac7221c2e9cea910f74cddd46a5ff59","contentType":"text/typescript; charset=utf-8"},{"id":"124afae8-17cf-5a1c-9b7e-2d591006d973","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/124afae8-17cf-5a1c-9b7e-2d591006d973/attachment.ts","path":"actors/social-media/linkedin.ts","size":8615,"sha256":"e2207e1140fa39f11459b5805c61e30b251d6dae62a2e8ada30ddf61b2c0f7c7","contentType":"text/typescript; charset=utf-8"},{"id":"3204e26c-ff51-5db0-b065-ac817e05d5d4","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/3204e26c-ff51-5db0-b065-ac817e05d5d4/attachment.ts","path":"actors/social-media/tiktok.ts","size":7731,"sha256":"69413cc9ed12fbdcf5edacc120f44e5d05ac753ed01184a6c054acd53edefb84","contentType":"text/typescript; charset=utf-8"},{"id":"c8d46cd4-e2ab-54f5-9306-1bf2fce552b7","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c8d46cd4-e2ab-54f5-9306-1bf2fce552b7/attachment.ts","path":"actors/social-media/twitter.ts","size":8295,"sha256":"c80173d5390b91f8e0e8f907d378757773f23cfc8cb487796e5f3adbef50a525","contentType":"text/typescript; charset=utf-8"},{"id":"45dac806-c435-5c06-ae01-ec3102740a5b","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/45dac806-c435-5c06-ae01-ec3102740a5b/attachment.ts","path":"actors/social-media/youtube.ts","size":8167,"sha256":"74648bf89b3db6301158d1112c2b0f1c5ba3a3c074a138f4142dd9e585840bf5","contentType":"text/typescript; 
charset=utf-8"},{"id":"ba1cc00f-68d4-5777-bcba-6cca4800f4d9","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/ba1cc00f-68d4-5777-bcba-6cca4800f4d9/attachment.ts","path":"actors/web/index.ts","size":112,"sha256":"d8ebb9d8a5cd94a71f33e3c8fddb12c63e41580eb83ec02add0395159545549f","contentType":"text/typescript; charset=utf-8"},{"id":"cbdbf493-dff6-575d-b7b7-4498a32fca48","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/cbdbf493-dff6-575d-b7b7-4498a32fca48/attachment.ts","path":"actors/web/web-scraper.ts","size":6020,"sha256":"37db669ab59d8abffcb52ba2bef2a8699d5a87ef429b3a6471404d56d71d822e","contentType":"text/typescript; charset=utf-8"},{"id":"c2de5418-a1c2-59dc-92ca-f5686b737131","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c2de5418-a1c2-59dc-92ca-f5686b737131/attachment.ts","path":"examples/comparison-test.ts","size":8080,"sha256":"04bd05632154fc95bda3041f5493689a9f48b22803fd4b1c5ba1035642c75499","contentType":"text/typescript; charset=utf-8"},{"id":"f290f2f2-49ff-57c3-9b79-e00169196b82","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/f290f2f2-49ff-57c3-9b79-e00169196b82/attachment.ts","path":"examples/instagram-scraper.ts","size":4616,"sha256":"eb19d922183e448e90693ff90414dac066c642bde2e7308f190bc639c1223e9c","contentType":"text/typescript; charset=utf-8"},{"id":"867ad2a1-6081-554c-b529-3891f5d77c23","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/867ad2a1-6081-554c-b529-3891f5d77c23/attachment.ts","path":"examples/smoke-test.ts","size":2302,"sha256":"c1166bf4540539ae8ec38870d70bb0fb70d919de971f2b46b2351790200a6050","contentType":"text/typescript; charset=utf-8"},{"id":"4b1a9879-c1a6-51cb-b12d-2c113f843bdd","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/4b1a9879-c1a6-51cb-b12d-2c113f843bdd/attachment.ts","path":"index.ts","size":6388,"sha256":"7a9c8daafce3cfc8e7dc3ee734185c0e8d98d49106632adc01174167f20b9fca","contentType":"text/typescript; 
charset=utf-8"},{"id":"4bfebbdf-a2b6-5999-a6a7-0417eccc4803","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/4bfebbdf-a2b6-5999-a6a7-0417eccc4803/attachment.json","path":"package.json","size":407,"sha256":"0bda74058ca6f67acfe94a69f59a637c84c262051822c49040da35a209e4b347","contentType":"application/json; charset=utf-8"},{"id":"328b962f-00cc-5a3e-bd8d-bb9ebc7c6a34","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/328b962f-00cc-5a3e-bd8d-bb9ebc7c6a34/attachment.ts","path":"skills/get-user-tweets.ts","size":3467,"sha256":"ecd7105bee66f68bf6cf8e1d7dbf229c3d31d1af81ec368807095b9a92236fd4","contentType":"text/typescript; charset=utf-8"},{"id":"2a11d5bf-d7f8-50d3-ae04-b01da5900b13","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/2a11d5bf-d7f8-50d3-ae04-b01da5900b13/attachment.json","path":"tsconfig.json","size":502,"sha256":"3c3c1714e30747b6e0ff6122f0abd0691beeea4c09c21e6a90fa306758d36222","contentType":"application/json; charset=utf-8"},{"id":"fc65cd36-90c5-5621-9432-77102a5b2b85","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/fc65cd36-90c5-5621-9432-77102a5b2b85/attachment.ts","path":"types/common.ts","size":2344,"sha256":"109258ba3dcc6c380e4a071faeedafd3489bbf0ae23005dac937f4360114ede5","contentType":"text/typescript; charset=utf-8"},{"id":"50f7bff9-9ce5-52fa-a6e8-ad8c2009edaa","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/50f7bff9-9ce5-52fa-a6e8-ad8c2009edaa/attachment.ts","path":"types/index.ts","size":71,"sha256":"751b938cb3e1ffbfb79dfb22642672f4099a55e889adfdf3095e32f12da0dd1c","contentType":"text/typescript; 
charset=utf-8"}],"bundle_sha256":"03e83be06d88336d94d70600b53bdaae63344ea71e87ed7f0578eff7ac1bac18","attachment_count":27,"text_attachments":26,"attachment_storage":"skillopedia-attachments-v1","binary_attachments":1,"excluded_attachments":[]},"cluster_size":2,"skill_md_path":"Releases/v5.0.0/.claude/skills/Apify/SKILL.md","import_metadata":{"date":"2026-06-05","author":"@skillopedia","version":"v1","category":"browser-automation-scraping","category_label":"Browser"},"exact_dupes_collapsed_into_this":1},"version":"v1","category":"browser-automation-scraping","import_tag":"clean-skills-v1","description":"Scrape social media platforms, business data, and e-commerce via Apify actors — Instagram profiles/posts/hashtags/comments, LinkedIn profiles/jobs/posts, TikTok profiles/hashtags/videos/comments, YouTube channels/search/comments, Facebook posts/groups/comments, Google Maps business search with contact/review/image extraction, Amazon products/reviews/pricing, and general-purpose multi-page web crawling with custom pageFunction extraction logic. File-based TypeScript wrappers (scrapeInstagramProfile, searchGoogleMaps, scrapeAmazonProduct, scrapeWebsite, etc.) filter and transform data in code before returning to model context, achieving 95-99% token savings over direct MCP protocol. Parallel multi-platform queries via Promise.all for social listening dashboards. Lead enrichment pipeline: Google Maps → qualified filter → optional LinkedIn enrichment. Competitive analysis across Instagram, YouTube, and TikTok simultaneously. USE WHEN scrape Instagram, scrape LinkedIn, scrape TikTok, scrape YouTube, scrape Facebook, Google Maps leads, Amazon reviews, business intelligence, multi-platform social listening, competitive analysis, lead generation, social monitoring, Apify actors, web crawl, extract contacts. NOT FOR X/Twitter bookmarks (use a dedicated X-API skill) or progressive scraping (use BrightData)."}},"renderedAt":1782981526834}
