Jobs API
Complete reference for the Jobs API endpoints to create, manage, and monitor scraping jobs.
The Jobs API allows you to programmatically create, update, and manage scraping jobs. All endpoints require authentication via Bearer token.
Base URL
```
https://api.snowscrape.com
```

Endpoints Overview
| Method | Endpoint | Description |
|---|---|---|
| GET | /jobs/status | List all jobs |
| GET | /jobs/{id} | Get job details |
| POST | /jobs | Create a new job |
| PUT | /jobs/{id} | Update a job |
| DELETE | /jobs/{id} | Delete a job |
| POST | /jobs/{id}/run | Trigger job immediately |
| POST | /jobs/{id}/pause | Pause a running job |
| POST | /jobs/{id}/resume | Resume a paused job |
| GET | /jobs/{id}/results | Get job results |
| GET | /jobs/{id}/download | Download results as file |
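Every endpoint in the table expects the same Bearer-token header. As a minimal sketch of an authenticated call (the helper names here are illustrative, not part of the API; `requests` and a `SNOWSCRAPE_API_KEY` environment variable are assumed), listing jobs could look like:

```python
import os

import requests

BASE_URL = "https://api.snowscrape.com"

def auth_headers(api_key: str) -> dict:
    # Every Jobs API endpoint expects the same Authorization header.
    return {"Authorization": f"Bearer {api_key}"}

def list_jobs(api_key: str) -> list:
    # GET /jobs/status returns {"jobs": [...], "count": N}.
    resp = requests.get(f"{BASE_URL}/jobs/status", headers=auth_headers(api_key))
    resp.raise_for_status()
    return resp.json()["jobs"]

# Usage (requires network access and a valid key):
# jobs = list_jobs(os.environ["SNOWSCRAPE_API_KEY"])
```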
List All Jobs
GET /jobs/status

Returns a list of all jobs for the authenticated user.
Response
```json
{
  "jobs": [
    {
      "job_id": "job_abc123",
      "name": "Product Scraper",
      "source": "https://example.com/products",
      "status": "success",
      "created_at": "2024-01-15T10:00:00Z",
      "last_run": "2024-01-20T14:30:00Z",
      "results_count": 150
    }
  ],
  "count": 1
}
```

Create a Job
POST /jobs

Creates a new scraping job with the specified configuration.
Request Body
```json
{
  "name": "Product Prices",
  "source": "https://example.com/products",
  "rate_limit": 10,
  "queries": [
    {
      "name": "title",
      "type": "xpath",
      "query": "//h1[@class='product-title']/text()"
    },
    {
      "name": "price",
      "type": "css",
      "query": ".price-value::text"
    }
  ],
  "scheduling": {
    "days": [1, 2, 3, 4, 5],
    "hours": [9],
    "minutes": [0]
  },
  "render_config": {
    "enabled": true,
    "wait_strategy": "networkidle",
    "wait_timeout_ms": 5000
  },
  "proxy_config": {
    "enabled": true,
    "rotation_strategy": "round-robin",
    "geo_targeting": "US"
  }
}
```

Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Job display name |
| source | string | Yes | URL to scrape |
| queries | array | Yes | Data extraction queries |
| rate_limit | number | No | Requests per minute (default: 10) |
| scheduling | object | No | Cron-like schedule config (arrays of days, hours, minutes) |
| render_config | object | No | JavaScript rendering options |
| proxy_config | object | No | Proxy rotation settings |
Download Results
GET /jobs/{id}/download?format=json

Downloads job results in the specified format.
Query Parameters
format: output format; one of json, csv, xlsx, parquet, or sql
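Result exports can be large, so streaming the response to disk is a reasonable approach. A hedged sketch (the function names are illustrative, not part of the API):

```python
import requests

BASE_URL = "https://api.snowscrape.com"
FORMATS = {"json", "csv", "xlsx", "parquet", "sql"}

def results_filename(job_id: str, fmt: str) -> str:
    # Derive a local filename from the job id and export format.
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return f"{job_id}.{fmt}"

def download_results(api_key: str, job_id: str, fmt: str = "csv") -> str:
    # Stream the export to disk so large result sets are not held in memory.
    out_path = results_filename(job_id, fmt)
    resp = requests.get(
        f"{BASE_URL}/jobs/{job_id}/download",
        params={"format": fmt},
        headers={"Authorization": f"Bearer {api_key}"},
        stream=True,
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=8192):
            f.write(chunk)
    return out_path
```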
Job Status Values
| Status | Meaning |
|---|---|
| running | Job is executing |
| success | Completed successfully |
| failed | Execution failed |
| paused | Manually paused |
| scheduled | Waiting for schedule |
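These statuses make it straightforward to poll a job until it settles. A minimal polling sketch, assuming the job detail response from GET /jobs/{id} carries the same status field as the list response, and treating success, failed, and paused as states that will not change without further input:

```python
import time

import requests

BASE_URL = "https://api.snowscrape.com"
# Assumed terminal for polling purposes: running/scheduled keep changing on their own.
TERMINAL_STATUSES = {"success", "failed", "paused"}

def is_terminal(status: str) -> bool:
    return status in TERMINAL_STATUSES

def wait_for_job(api_key: str, job_id: str, poll_seconds: float = 10.0) -> str:
    # Poll GET /jobs/{id} until the job leaves running/scheduled.
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        resp = requests.get(f"{BASE_URL}/jobs/{job_id}", headers=headers)
        resp.raise_for_status()
        status = resp.json()["status"]
        if is_terminal(status):
            return status
        time.sleep(poll_seconds)
```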
Error Responses
400 Bad Request

```json
{
  "error": "validation_error",
  "message": "Invalid query configuration",
  "details": {
    "queries[0].type": "Must be one of: xpath, css, regex"
  }
}
```

404 Not Found

```json
{
  "error": "not_found",
  "message": "Job not found"
}
```

429 Too Many Requests

```json
{
  "error": "rate_limited",
  "message": "API rate limit exceeded",
  "retry_after": 60
}
```

Code Examples
Create and Run a Job (Python)
```python
import os

import requests

API_KEY = os.environ['SNOWSCRAPE_API_KEY']
BASE_URL = 'https://api.snowscrape.com'
HEADERS = {'Authorization': f'Bearer {API_KEY}'}

# Create a job
job_config = {
    "name": "Example Scraper",
    "source": "https://example.com/page",
    "queries": [
        {"name": "title", "type": "xpath", "query": "//h1/text()"},
        {"name": "content", "type": "css", "query": "article p::text", "join": True}
    ]
}

response = requests.post(f'{BASE_URL}/jobs', json=job_config, headers=HEADERS)
response.raise_for_status()  # fail fast on validation or auth errors
job = response.json()
print(f"Created job: {job['job_id']}")

# Trigger the job immediately
run_response = requests.post(f'{BASE_URL}/jobs/{job["job_id"]}/run', headers=HEADERS)
run_response.raise_for_status()
```