Test Data Factory Writing test setup is slower than writing the test itself. You have a User model with 12 fields, a Post model that requires a User, and an Order model that requires both. Every test file re-invents the same createTestUser() boilerplate, with slightly different hardcoded values that don't cover edge cases. Paste your schema and get a complete factory module: realistic Faker-powered defaults for every field, relationship-aware ordering, and one-line overrides for specific test scenarios. Reads any schema format. Outputs TypeScript, Python, or SQL. Zero external APIs. --- Trigger Phrases - "generate test data…

, col_line, re.IGNORECASE)\n if not col_match:\n continue\n\n fname = col_match.group(1)\n ftype = col_match.group(2).upper()\n rest = col_match.group(4).upper()\n is_nullable = 'NOT NULL' not in rest\n is_auto = 'AUTO_INCREMENT' in rest or 'SERIAL' in ftype\n is_unique = 'UNIQUE' in rest\n\n fields.append(PrismaField(\n name=fname,\n type=ftype,\n is_optional=is_nullable,\n is_auto=is_auto,\n is_unique=is_unique,\n ))\n\n models[table_name] = fields\n\n return {'models': models, 'enums': {}}\n```\n\n---\n\n## Step 2: Map Types to Faker Functions\n\n```python\n# Prisma/TypeScript type → Faker.js function\nFAKER_JS_MAP = {\n # Primitives\n 'String': 'faker.lorem.words(3)',\n 'Int': 'faker.number.int({ min: 1, max: 10000 })',\n 'Float': 'faker.number.float({ min: 0, max: 1000, fractionDigits: 2 })',\n 'Boolean': 'faker.datatype.boolean()',\n 'DateTime': 'faker.date.recent({ days: 30 })',\n 'BigInt': 'BigInt(faker.number.int({ min: 1, max: 1000000 }))',\n 'Json': '{}',\n 'Bytes': 'Buffer.from(faker.string.alphanumeric(16))',\n\n # Semantic overrides (based on field name)\n 'email': 'faker.internet.email()',\n 'name': 'faker.person.fullName()',\n 'firstName': 'faker.person.firstName()',\n 'lastName': 'faker.person.lastName()',\n 'username': 'faker.internet.username()',\n 'password': 'faker.internet.password({ length: 12 })',\n 'phone': 'faker.phone.number()',\n 'address': 'faker.location.streetAddress()',\n 'city': 'faker.location.city()',\n 'country': 'faker.location.country()',\n 'zipCode': 'faker.location.zipCode()',\n 'url': 'faker.internet.url()',\n 'imageUrl': 'faker.image.url()',\n 'avatar': 'faker.image.avatar()',\n 'bio': 'faker.lorem.paragraph()',\n 'description': 'faker.lorem.sentences(2)',\n 'title': 'faker.lorem.sentence()',\n 'slug': 'faker.helpers.slugify(faker.lorem.words(3))',\n 'color': 'faker.color.human()',\n 'uuid': 'faker.string.uuid()',\n 'ip': 'faker.internet.ip()',\n 'createdAt': 'faker.date.past({ years: 1 })',\n 'updatedAt': 'new Date()',\n 
'deletedAt': 'null',\n 'publishedAt': 'faker.date.recent({ days: 90 })',\n 'price': 'faker.number.float({ min: 0.99, max: 999.99, fractionDigits: 2 })',\n 'amount': 'faker.number.int({ min: 1, max: 10000 })',\n 'quantity': 'faker.number.int({ min: 1, max: 100 })',\n 'score': 'faker.number.float({ min: 0, max: 5, fractionDigits: 1 })',\n 'rating': 'faker.number.int({ min: 1, max: 5 })',\n 'status': None, # replaced by enum values\n 'role': None, # replaced by enum values\n 'type': None, # replaced by enum values\n}\n\n# Same mapping for Python Faker\nFAKER_PY_MAP = {\n 'String': \"fake.sentence(nb_words=3)\",\n 'str': \"fake.sentence(nb_words=3)\",\n 'Int': \"fake.random_int(min=1, max=10000)\",\n 'int': \"fake.random_int(min=1, max=10000)\",\n 'Float': \"round(random.uniform(0, 1000), 2)\",\n 'float': \"round(random.uniform(0, 1000), 2)\",\n 'bool': \"fake.boolean()\",\n 'datetime': \"fake.date_time_this_year()\",\n 'email': \"fake.email()\",\n 'name': \"fake.name()\",\n 'phone': \"fake.phone_number()\",\n 'url': \"fake.url()\",\n 'uuid': \"str(uuid.uuid4())\",\n 'price': \"round(random.uniform(0.99, 999.99), 2)\",\n}\n\ndef get_faker_value(field_name: str, field_type: str, enum_values: list, lang: str = 'ts') -> str:\n \"\"\"Get the Faker expression for a field.\"\"\"\n mapper = FAKER_JS_MAP if lang == 'ts' else FAKER_PY_MAP\n prefix = 'faker.' 
if lang == 'ts' else 'fake.'\n\n # Enum field: pick from enum values\n if enum_values:\n if lang == 'ts':\n return f'faker.helpers.arrayElement([{\", \".join(repr(v) for v in enum_values)}])'\n else:\n return f'random.choice([{\", \".join(repr(v) for v in enum_values)}])'\n\n # Check semantic field name first\n for semantic_key, expr in mapper.items():\n if field_name.lower().endswith(semantic_key.lower()) or field_name.lower() == semantic_key.lower():\n if expr:\n return expr\n\n # Fall back to type mapping\n return mapper.get(field_type, f'\"TODO: {field_type}\"')\n```\n\n---\n\n## Step 3: Detect Relationship Order\n\nTopologically sort models so parents are created before children:\n\n```python\ndef topological_sort(models: dict) -> list[str]:\n \"\"\"Return model names in dependency order (parents first).\"\"\"\n from collections import defaultdict, deque\n\n graph = defaultdict(list)\n in_degree = {name: 0 for name in models}\n\n for model_name, fields in models.items():\n for field in fields:\n if field.relation and field.relation in models and not field.is_optional:\n # model_name depends on field.relation\n graph[field.relation].append(model_name)\n in_degree[model_name] += 1\n\n queue = deque([m for m, d in in_degree.items() if d == 0])\n order = []\n\n while queue:\n model = queue.popleft()\n order.append(model)\n for dependent in graph[model]:\n in_degree[dependent] -= 1\n if in_degree[dependent] == 0:\n queue.append(dependent)\n\n # Append any remaining (circular deps)\n for m in models:\n if m not in order:\n order.append(m)\n\n return order\n```\n\n---\n\n## Step 4: Generate Factory Code\n\n### TypeScript Output (Faker.js)\n\n```typescript\n// Generated by phy-test-data-factory\n// Install: npm install -D @faker-js/faker\n\nimport { faker } from '@faker-js/faker';\nimport { PrismaClient, UserRole, PostStatus } from '@prisma/client';\n\nconst prisma = new PrismaClient();\n\n// ─── User Factory 
────────────────────────────────────────────────────────────\n\nexport interface CreateUserOptions {\n id?: string;\n email?: string;\n name?: string;\n role?: UserRole;\n createdAt?: Date;\n}\n\nexport function buildUser(overrides: CreateUserOptions = {}) {\n return {\n id: faker.string.uuid(),\n email: faker.internet.email(),\n name: faker.person.fullName(),\n username: faker.internet.username(),\n password: faker.internet.password({ length: 12 }),\n bio: faker.lorem.paragraph(),\n avatarUrl: faker.image.avatar(),\n role: faker.helpers.arrayElement(['USER', 'ADMIN', 'MODERATOR'] as UserRole[]),\n isActive: true,\n createdAt: faker.date.past({ years: 1 }),\n updatedAt: new Date(),\n ...overrides,\n };\n}\n\nexport async function createUser(overrides: CreateUserOptions = {}) {\n return prisma.user.create({ data: buildUser(overrides) });\n}\n\n// ─── Post Factory ─────────────────────────────────────────────────────────────\n\nexport interface CreatePostOptions {\n id?: string;\n title?: string;\n content?: string;\n authorId?: string; // Will create a User if not provided\n status?: PostStatus;\n}\n\nexport async function createPost(overrides: CreatePostOptions = {}) {\n const authorId = overrides.authorId ?? 
(await createUser()).id;\n return prisma.post.create({\n data: {\n id: faker.string.uuid(),\n title: faker.lorem.sentence(),\n slug: faker.helpers.slugify(faker.lorem.words(4)),\n content: faker.lorem.paragraphs(3),\n excerpt: faker.lorem.sentences(2),\n status: faker.helpers.arrayElement(['DRAFT', 'PUBLISHED', 'ARCHIVED'] as PostStatus[]),\n publishedAt: faker.date.recent({ days: 90 }),\n authorId,\n createdAt: faker.date.past({ years: 1 }),\n updatedAt: new Date(),\n ...overrides,\n },\n });\n}\n\n// ─── Order Factory ────────────────────────────────────────────────────────────\n\nexport interface CreateOrderOptions {\n id?: string;\n userId?: string;\n total?: number;\n status?: 'PENDING' | 'CONFIRMED' | 'SHIPPED' | 'DELIVERED' | 'CANCELLED';\n}\n\nexport async function createOrder(overrides: CreateOrderOptions = {}) {\n const userId = overrides.userId ?? (await createUser()).id;\n return prisma.order.create({\n data: {\n id: faker.string.uuid(),\n userId,\n total: faker.number.float({ min: 9.99, max: 999.99, fractionDigits: 2 }),\n status: faker.helpers.arrayElement(['PENDING', 'CONFIRMED', 'SHIPPED', 'DELIVERED', 'CANCELLED']),\n address: faker.location.streetAddress(),\n city: faker.location.city(),\n country: faker.location.country(),\n createdAt: faker.date.past({ years: 1 }),\n updatedAt: new Date(),\n ...overrides,\n },\n });\n}\n\n// ─── Bulk creation helpers ────────────────────────────────────────────────────\n\nexport async function createUsers(count: number, overrides: CreateUserOptions = {}) {\n return Promise.all(Array.from({ length: count }, () => createUser(overrides)));\n}\n\nexport async function createPosts(count: number, overrides: CreatePostOptions = {}) {\n return Promise.all(Array.from({ length: count }, () => createPost(overrides)));\n}\n\n// ─── Teardown ─────────────────────────────────────────────────────────────────\n\nexport async function clearTestData() {\n // Delete in reverse dependency order (children before parents)\n await 
prisma.order.deleteMany();\n await prisma.post.deleteMany();\n await prisma.user.deleteMany();\n}\n```\n\n### Python Output (factory_boy)\n\n```python\n# Generated by phy-test-data-factory\n# Install: pip install factory_boy faker\n\nimport uuid, random\nfrom datetime import datetime\nimport factory\nfrom factory import Faker, SubFactory, LazyFunction\nfrom factory.django import DjangoModelFactory # or SQLAlchemyModelFactory\nfrom myapp.models import User, Post, Order, UserRole, PostStatus\n\nclass UserFactory(DjangoModelFactory):\n class Meta:\n model = User\n\n id = LazyFunction(lambda: str(uuid.uuid4()))\n email = Faker('email')\n name = Faker('name')\n username = Faker('user_name')\n bio = Faker('paragraph')\n avatar_url = Faker('image_url')\n role = factory.Iterator([r.value for r in UserRole])\n is_active = True\n created_at = Faker('date_time_this_year')\n updated_at = LazyFunction(datetime.utcnow)\n\nclass PostFactory(DjangoModelFactory):\n class Meta:\n model = Post\n\n id = LazyFunction(lambda: str(uuid.uuid4()))\n title = Faker('sentence', nb_words=6)\n slug = factory.LazyAttribute(lambda o: o.title.lower().replace(' ', '-').replace(',', ''))\n content = Faker('paragraphs', nb=3, as_list=False)\n excerpt = Faker('sentences', nb=2, as_list=False)\n status = factory.Iterator([s.value for s in PostStatus])\n author = SubFactory(UserFactory)\n published_at = Faker('date_time_this_month')\n created_at = Faker('date_time_this_year')\n updated_at = LazyFunction(datetime.utcnow)\n\nclass OrderFactory(DjangoModelFactory):\n class Meta:\n model = Order\n\n id = LazyFunction(lambda: str(uuid.uuid4()))\n user = SubFactory(UserFactory)\n total = LazyFunction(lambda: round(random.uniform(9.99, 999.99), 2))\n status = factory.Iterator(['PENDING', 'CONFIRMED', 'SHIPPED', 'DELIVERED'])\n address = Faker('street_address')\n city = Faker('city')\n country = Faker('country')\n created_at = Faker('date_time_this_year')\n updated_at = LazyFunction(datetime.utcnow)\n\n\n# 
Usage in pytest conftest.py:\n#\n# @pytest.fixture\n# def user(db):\n# return UserFactory()\n#\n# @pytest.fixture\n# def post_with_author(db):\n# return PostFactory() # auto-creates a User via SubFactory\n#\n# @pytest.fixture\n# def many_orders(db):\n# return OrderFactory.create_batch(20)\n```\n\n### SQL Seed Output\n\n```sql\n-- Generated by phy-test-data-factory\n-- Seed data for: users, posts, orders\n-- Insert in dependency order (parents first)\n\n-- Users (10 rows)\nINSERT INTO users (id, email, name, role, is_active, created_at) VALUES\n ('usr_001', '[email protected]', 'Alice Johnson', 'USER', true, '2026-01-15 09:30:00'),\n ('usr_002', '[email protected]', 'Bob Smith', 'ADMIN', true, '2026-01-20 14:00:00'),\n ('usr_003', '[email protected]', 'Carol Williams', 'USER', true, '2026-02-01 11:00:00'),\n -- ... (7 more rows)\n\n-- Posts (20 rows, requires users above)\nINSERT INTO posts (id, title, slug, status, author_id, created_at) VALUES\n ('post_001', 'Getting Started with Testing', 'getting-started-testing', 'PUBLISHED', 'usr_001', '2026-02-10 10:00:00'),\n ('post_002', 'Advanced Patterns in TypeScript', 'advanced-typescript', 'DRAFT', 'usr_002', '2026-02-15 11:30:00'),\n -- ... 
(18 more rows)\n```\n\n---\n\n## Step 5: Output Report\n\n```markdown\n## Test Data Factory — Generated\nSchema: prisma/schema.prisma | Models: User, Post, Comment, Order, Tag\nOutput: src/test/factories/index.ts\n\n---\n\n### Models Processed (dependency order)\n\n| Model | Fields | Relationships | Factory Type |\n|-------|--------|--------------|-------------|\n| User | 14 fields | — (root) | createUser() |\n| Tag | 4 fields | — (root) | createTag() |\n| Post | 11 fields | → User (author) | createPost() |\n| Comment | 8 fields | → User, → Post | createComment() |\n| Order | 9 fields | → User | createOrder() |\n\n---\n\n### Generated Files\n\n- `src/test/factories/index.ts` — all factory functions\n- `src/test/factories/builders.ts` — plain object builders (no DB write)\n- `src/test/setup.ts` — jest/vitest beforeAll/afterAll with clearTestData()\n\n---\n\n### Auto-Detected Semantic Mappings\n\n| Field | Detected As | Faker Function Used |\n|-------|------------|---------------------|\n| `email` | Email address | `faker.internet.email()` |\n| `avatarUrl` | Image URL | `faker.image.avatar()` |\n| `publishedAt` | Recent date | `faker.date.recent({ days: 90 })` |\n| `role` | Enum (USER/ADMIN/MOD) | `faker.helpers.arrayElement([...])` |\n| `slug` | URL slug | `faker.helpers.slugify(faker.lorem.words(3))` |\n| `price` | Currency amount | `faker.number.float({ fractionDigits: 2 })` |\n\n---\n\n### Quick Usage\n\n```typescript\nimport { createUser, createPost, createOrder, clearTestData } from './factories';\n\n// Single record\nconst user = await createUser();\n\n// With overrides\nconst adminUser = await createUser({ role: 'ADMIN', email: '[email protected]' });\n\n// Relationships handled automatically\nconst post = await createPost(); // creates a User internally\nconst post2 = await createPost({ authorId: user.id }); // reuse existing User\n\n// Batch creation\nconst orders = await createOrders(50);\n\n// Teardown\nafterAll(clearTestData);\n```\n```\n\n---\n\n## Edge 
Case Variants\n\nWith `--edge-cases`, generate additional factory variants for boundary testing:\n\n```typescript\n// Generated edge-case builders for User model:\n\nexport const edgeCaseUsers = {\n withMinLengthFields: () => buildUser({\n email: '[email protected]',\n name: 'A',\n bio: '',\n }),\n withMaxLengthFields: () => buildUser({\n email: 'a'.repeat(243) + '@b.co', // 255 chars total\n name: 'A'.repeat(255),\n bio: 'x'.repeat(5000),\n }),\n withNullableFieldsNull: () => buildUser({\n bio: null,\n avatarUrl: null,\n phoneNumber: null,\n }),\n withSpecialCharacters: () => buildUser({\n name: \"O'Brien-Smith, Jr.\",\n bio: '\u003cscript>alert(\"xss\")\u003c/script>', // for XSS testing\n }),\n withUnicodeContent: () => buildUser({\n name: '张伟',\n bio: '日本語テキスト with emoji 🎉',\n }),\n withPastDates: () => buildUser({\n createdAt: new Date('2000-01-01'),\n }),\n withFutureDates: () => buildUser({\n createdAt: new Date('2099-12-31'),\n }),\n};\n```\n\n---\n\n## Install Dependencies\n\n```bash\n# TypeScript / JavaScript\nnpm install -D @faker-js/faker\n\n# Python (Django)\npip install factory_boy faker\n\n# Python (SQLAlchemy)\npip install factory_boy faker sqlalchemy\n\n# Verify installation\nnode -e \"const { faker } = require('@faker-js/faker'); console.log(faker.person.fullName())\"\npython3 -c \"import factory; print('factory_boy ready')\"\n```\n---","attachment_filenames":["_meta.json"],"attachments":[{"filename":"_meta.json","content":"{\n \"owner\": \"phy041\",\n \"slug\": \"phy-test-data-factory\",\n \"displayName\": \"Phy Test Data Factory\",\n \"latest\": {\n \"version\": \"1.0.0\",\n \"publishedAt\": 1773796098231,\n \"commit\": \"https://github.com/openclaw/skills/commit/6d9f5f72d8553b784b186fe07d479da8a1094d4d\"\n },\n \"history\": []\n}\n","content_type":"application/json; 
charset=utf-8","language":"json","size":294,"content_sha256":"50e2dceef8bff3162e02f13b8b367f78d762e7f67477129e96fd89ea68303e70"}],"content_json":{"type":"doc","content":[{"type":"heading","attrs":{"level":1},"content":[{"text":"Test Data Factory","type":"text"}]},{"type":"paragraph","content":[{"text":"Writing test setup is slower than writing the test itself. You have a ","type":"text"},{"text":"User","type":"text","marks":[{"type":"code_inline"}]},{"text":" model with 12 fields, a ","type":"text"},{"text":"Post","type":"text","marks":[{"type":"code_inline"}]},{"text":" model that requires a User, and an ","type":"text"},{"text":"Order","type":"text","marks":[{"type":"code_inline"}]},{"text":" model that requires both. Every test file re-invents the same ","type":"text"},{"text":"createTestUser()","type":"text","marks":[{"type":"code_inline"}]},{"text":" boilerplate — with slightly different hardcoded values that don't cover edge cases.","type":"text"}]},{"type":"paragraph","content":[{"text":"Paste your schema and get a complete factory module: realistic Faker-powered defaults for every field, relationship-aware ordering, and one-line overrides for specific test scenarios.","type":"text"}]},{"type":"paragraph","content":[{"text":"Reads any schema format. Outputs TypeScript, Python, or SQL. 
Zero external APIs.","type":"text","marks":[{"type":"strong"}]}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Trigger Phrases","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"\"generate test data\", \"seed my database\", \"test fixtures\"","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"\"factory functions\", \"fake data from schema\", \"test data setup\"","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"\"create test factories\", \"Faker from schema\", \"factory_boy setup\"","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"\"generate seed data\", \"populate test database\"","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"\"I need fake users/orders/products for testing\"","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"\"/test-data-factory\"","type":"text"}]}]}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"How to Provide Input","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# Option 1: Prisma schema\n/test-data-factory schema.prisma\n/test-data-factory prisma/schema.prisma\n\n# Option 2: SQLAlchemy / Django models file\n/test-data-factory models.py\n/test-data-factory app/models.py\n\n# Option 3: TypeORM entities directory\n/test-data-factory src/entities/\n\n# Option 4: Zod schemas file\n/test-data-factory src/schemas/user.schema.ts\n\n# Option 5: Raw SQL DDL\n/test-data-factory --sql migrations/001_initial.sql\n\n# Option 6: Output format override\n/test-data-factory schema.prisma --output typescript\n/test-data-factory models.py --output python\n/test-data-factory schema.prisma --output sql\n\n# Option 7: Include edge-case 
variants\n/test-data-factory schema.prisma --edge-cases\n\n# Option 8: Specific count\n/test-data-factory schema.prisma --count 50","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Step 1: Detect and Parse Schema","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Prisma Schema Parser","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"import re\nfrom dataclasses import dataclass, field\nfrom typing import Any\n\n@dataclass\nclass PrismaField:\n name: str\n type: str\n is_optional: bool = False\n is_list: bool = False\n is_id: bool = False\n is_unique: bool = False\n is_auto: bool = False\n default: Any = None\n relation: str | None = None\n enum_values: list[str] = field(default_factory=list)\n\ndef parse_prisma_schema(schema_text: str) -> dict:\n \"\"\"Parse Prisma schema into model definitions.\"\"\"\n models = {}\n enums = {}\n\n # Parse enums first\n for enum_match in re.finditer(r'enum\\s+(\\w+)\\s*\\{([^}]+)\\}', schema_text, re.DOTALL):\n enum_name = enum_match.group(1)\n values = [v.strip() for v in enum_match.group(2).split('\\n')\n if v.strip() and not v.strip().startswith('//')]\n enums[enum_name] = values\n\n # Parse models\n for model_match in re.finditer(r'model\\s+(\\w+)\\s*\\{([^}]+)\\}', schema_text, re.DOTALL):\n model_name = model_match.group(1)\n body = model_match.group(2)\n fields = []\n\n for line in body.split('\\n'):\n line = line.strip()\n if not line or line.startswith('//') or line.startswith('@@'):\n continue\n # Parse field: name type? 
modifiers\n parts = line.split()\n if len(parts) \u003c 2:\n continue\n\n fname = parts[0]\n ftype_raw = parts[1]\n\n is_optional = ftype_raw.endswith('?')\n is_list = ftype_raw.endswith('[]')\n ftype = ftype_raw.rstrip('?').rstrip('[]')\n\n is_id = '@id' in line\n is_unique = '@unique' in line\n is_auto = '@default(autoincrement())' in line or '@default(auto())' in line or '@default(uuid())' in line or '@default(cuid())' in line\n is_relation = '@relation' in line\n\n default_match = re.search(r'@default\\((.+?)\\)', line)\n default_val = default_match.group(1) if default_match else None\n\n fields.append(PrismaField(\n name=fname,\n type=ftype,\n is_optional=is_optional,\n is_list=is_list,\n is_id=is_id,\n is_unique=is_unique,\n is_auto=is_auto,\n default=default_val,\n relation=ftype if is_relation and ftype[0].isupper() else None,\n enum_values=enums.get(ftype, []),\n ))\n\n models[model_name] = fields\n\n return {'models': models, 'enums': enums}","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"SQL DDL Parser","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"def parse_sql_ddl(sql_text: str) -> dict:\n \"\"\"Parse CREATE TABLE statements.\"\"\"\n models = {}\n\n for table_match in re.finditer(\n r'CREATE\\s+TABLE\\s+(?:IF\\s+NOT\\s+EXISTS\\s+)?[`\"]?(\\w+)[`\"]?\\s*\\(([^;]+)\\)',\n sql_text, re.IGNORECASE | re.DOTALL\n ):\n table_name = table_match.group(1)\n columns_text = table_match.group(2)\n fields = []\n\n for col_line in columns_text.split(','):\n col_line = col_line.strip()\n if not col_line or col_line.upper().startswith(('PRIMARY', 'FOREIGN', 'UNIQUE', 'INDEX', 'KEY', 'CONSTRAINT')):\n continue\n\n col_match = re.match(r'[`\"]?(\\w+)[`\"]?\\s+(\\w+)(\\(\\d+\\))?(.*)

, col_line, re.IGNORECASE)\n if not col_match:\n continue\n\n fname = col_match.group(1)\n ftype = col_match.group(2).upper()\n rest = col_match.group(4).upper()\n is_nullable = 'NOT NULL' not in rest\n is_auto = 'AUTO_INCREMENT' in rest or 'SERIAL' in ftype\n is_unique = 'UNIQUE' in rest\n\n fields.append(PrismaField(\n name=fname,\n type=ftype,\n is_optional=is_nullable,\n is_auto=is_auto,\n is_unique=is_unique,\n ))\n\n models[table_name] = fields\n\n return {'models': models, 'enums': {}}","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Step 2: Map Types to Faker Functions","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"# Prisma/TypeScript type → Faker.js function\nFAKER_JS_MAP = {\n # Primitives\n 'String': 'faker.lorem.words(3)',\n 'Int': 'faker.number.int({ min: 1, max: 10000 })',\n 'Float': 'faker.number.float({ min: 0, max: 1000, fractionDigits: 2 })',\n 'Boolean': 'faker.datatype.boolean()',\n 'DateTime': 'faker.date.recent({ days: 30 })',\n 'BigInt': 'BigInt(faker.number.int({ min: 1, max: 1000000 }))',\n 'Json': '{}',\n 'Bytes': 'Buffer.from(faker.string.alphanumeric(16))',\n\n # Semantic overrides (based on field name)\n 'email': 'faker.internet.email()',\n 'name': 'faker.person.fullName()',\n 'firstName': 'faker.person.firstName()',\n 'lastName': 'faker.person.lastName()',\n 'username': 'faker.internet.username()',\n 'password': 'faker.internet.password({ length: 12 })',\n 'phone': 'faker.phone.number()',\n 'address': 'faker.location.streetAddress()',\n 'city': 'faker.location.city()',\n 'country': 'faker.location.country()',\n 'zipCode': 'faker.location.zipCode()',\n 'url': 'faker.internet.url()',\n 'imageUrl': 'faker.image.url()',\n 'avatar': 'faker.image.avatar()',\n 'bio': 'faker.lorem.paragraph()',\n 'description': 'faker.lorem.sentences(2)',\n 'title': 'faker.lorem.sentence()',\n 'slug': 
'faker.helpers.slugify(faker.lorem.words(3))',\n 'color': 'faker.color.human()',\n 'uuid': 'faker.string.uuid()',\n 'ip': 'faker.internet.ip()',\n 'createdAt': 'faker.date.past({ years: 1 })',\n 'updatedAt': 'new Date()',\n 'deletedAt': 'null',\n 'publishedAt': 'faker.date.recent({ days: 90 })',\n 'price': 'faker.number.float({ min: 0.99, max: 999.99, fractionDigits: 2 })',\n 'amount': 'faker.number.int({ min: 1, max: 10000 })',\n 'quantity': 'faker.number.int({ min: 1, max: 100 })',\n 'score': 'faker.number.float({ min: 0, max: 5, fractionDigits: 1 })',\n 'rating': 'faker.number.int({ min: 1, max: 5 })',\n 'status': None, # replaced by enum values\n 'role': None, # replaced by enum values\n 'type': None, # replaced by enum values\n}\n\n# Same mapping for Python Faker\nFAKER_PY_MAP = {\n 'String': \"fake.sentence(nb_words=3)\",\n 'str': \"fake.sentence(nb_words=3)\",\n 'Int': \"fake.random_int(min=1, max=10000)\",\n 'int': \"fake.random_int(min=1, max=10000)\",\n 'Float': \"round(random.uniform(0, 1000), 2)\",\n 'float': \"round(random.uniform(0, 1000), 2)\",\n 'bool': \"fake.boolean()\",\n 'datetime': \"fake.date_time_this_year()\",\n 'email': \"fake.email()\",\n 'name': \"fake.name()\",\n 'phone': \"fake.phone_number()\",\n 'url': \"fake.url()\",\n 'uuid': \"str(uuid.uuid4())\",\n 'price': \"round(random.uniform(0.99, 999.99), 2)\",\n}\n\ndef get_faker_value(field_name: str, field_type: str, enum_values: list, lang: str = 'ts') -> str:\n \"\"\"Get the Faker expression for a field.\"\"\"\n mapper = FAKER_JS_MAP if lang == 'ts' else FAKER_PY_MAP\n prefix = 'faker.' 
if lang == 'ts' else 'fake.'\n\n # Enum field: pick from enum values\n if enum_values:\n if lang == 'ts':\n return f'faker.helpers.arrayElement([{\", \".join(repr(v) for v in enum_values)}])'\n else:\n return f'random.choice([{\", \".join(repr(v) for v in enum_values)}])'\n\n # Check semantic field name first\n for semantic_key, expr in mapper.items():\n if field_name.lower().endswith(semantic_key.lower()) or field_name.lower() == semantic_key.lower():\n if expr:\n return expr\n\n # Fall back to type mapping\n return mapper.get(field_type, f'\"TODO: {field_type}\"')","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Step 3: Detect Relationship Order","type":"text"}]},{"type":"paragraph","content":[{"text":"Topologically sort models so parents are created before children:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"def topological_sort(models: dict) -> list[str]:\n \"\"\"Return model names in dependency order (parents first).\"\"\"\n from collections import defaultdict, deque\n\n graph = defaultdict(list)\n in_degree = {name: 0 for name in models}\n\n for model_name, fields in models.items():\n for field in fields:\n if field.relation and field.relation in models and not field.is_optional:\n # model_name depends on field.relation\n graph[field.relation].append(model_name)\n in_degree[model_name] += 1\n\n queue = deque([m for m, d in in_degree.items() if d == 0])\n order = []\n\n while queue:\n model = queue.popleft()\n order.append(model)\n for dependent in graph[model]:\n in_degree[dependent] -= 1\n if in_degree[dependent] == 0:\n queue.append(dependent)\n\n # Append any remaining (circular deps)\n for m in models:\n if m not in order:\n order.append(m)\n\n return order","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Step 4: Generate Factory 
Code","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"TypeScript Output (Faker.js)","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"typescript"},"content":[{"text":"// Generated by phy-test-data-factory\n// Install: npm install -D @faker-js/faker\n\nimport { faker } from '@faker-js/faker';\nimport { PrismaClient, UserRole, PostStatus } from '@prisma/client';\n\nconst prisma = new PrismaClient();\n\n// ─── User Factory ────────────────────────────────────────────────────────────\n\nexport interface CreateUserOptions {\n id?: string;\n email?: string;\n name?: string;\n role?: UserRole;\n createdAt?: Date;\n}\n\nexport function buildUser(overrides: CreateUserOptions = {}) {\n return {\n id: faker.string.uuid(),\n email: faker.internet.email(),\n name: faker.person.fullName(),\n username: faker.internet.username(),\n password: faker.internet.password({ length: 12 }),\n bio: faker.lorem.paragraph(),\n avatarUrl: faker.image.avatar(),\n role: faker.helpers.arrayElement(['USER', 'ADMIN', 'MODERATOR'] as UserRole[]),\n isActive: true,\n createdAt: faker.date.past({ years: 1 }),\n updatedAt: new Date(),\n ...overrides,\n };\n}\n\nexport async function createUser(overrides: CreateUserOptions = {}) {\n return prisma.user.create({ data: buildUser(overrides) });\n}\n\n// ─── Post Factory ─────────────────────────────────────────────────────────────\n\nexport interface CreatePostOptions {\n id?: string;\n title?: string;\n content?: string;\n authorId?: string; // Will create a User if not provided\n status?: PostStatus;\n}\n\nexport async function createPost(overrides: CreatePostOptions = {}) {\n const authorId = overrides.authorId ?? 
(await createUser()).id;\n return prisma.post.create({\n data: {\n id: faker.string.uuid(),\n title: faker.lorem.sentence(),\n slug: faker.helpers.slugify(faker.lorem.words(4)),\n content: faker.lorem.paragraphs(3),\n excerpt: faker.lorem.sentences(2),\n status: faker.helpers.arrayElement(['DRAFT', 'PUBLISHED', 'ARCHIVED'] as PostStatus[]),\n publishedAt: faker.date.recent({ days: 90 }),\n authorId,\n createdAt: faker.date.past({ years: 1 }),\n updatedAt: new Date(),\n ...overrides,\n },\n });\n}\n\n// ─── Order Factory ────────────────────────────────────────────────────────────\n\nexport interface CreateOrderOptions {\n id?: string;\n userId?: string;\n total?: number;\n status?: 'PENDING' | 'CONFIRMED' | 'SHIPPED' | 'DELIVERED' | 'CANCELLED';\n}\n\nexport async function createOrder(overrides: CreateOrderOptions = {}) {\n const userId = overrides.userId ?? (await createUser()).id;\n return prisma.order.create({\n data: {\n id: faker.string.uuid(),\n userId,\n total: faker.number.float({ min: 9.99, max: 999.99, fractionDigits: 2 }),\n status: faker.helpers.arrayElement(['PENDING', 'CONFIRMED', 'SHIPPED', 'DELIVERED', 'CANCELLED']),\n address: faker.location.streetAddress(),\n city: faker.location.city(),\n country: faker.location.country(),\n createdAt: faker.date.past({ years: 1 }),\n updatedAt: new Date(),\n ...overrides,\n },\n });\n}\n\n// ─── Bulk creation helpers ────────────────────────────────────────────────────\n\nexport async function createUsers(count: number, overrides: CreateUserOptions = {}) {\n return Promise.all(Array.from({ length: count }, () => createUser(overrides)));\n}\n\nexport async function createPosts(count: number, overrides: CreatePostOptions = {}) {\n return Promise.all(Array.from({ length: count }, () => createPost(overrides)));\n}\n\n// ─── Teardown ─────────────────────────────────────────────────────────────────\n\nexport async function clearTestData() {\n // Delete in reverse dependency order (children before parents)\n await 
prisma.order.deleteMany();\n await prisma.post.deleteMany();\n await prisma.user.deleteMany();\n}","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Python Output (factory_boy)","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"python"},"content":[{"text":"# Generated by phy-test-data-factory\n# Install: pip install factory_boy faker\n\nimport uuid, random\nfrom datetime import datetime\nimport factory\nfrom factory import Faker, SubFactory, LazyFunction\nfrom factory.django import DjangoModelFactory # or SQLAlchemyModelFactory\nfrom myapp.models import User, Post, Order, UserRole, PostStatus\n\nclass UserFactory(DjangoModelFactory):\n class Meta:\n model = User\n\n id = LazyFunction(lambda: str(uuid.uuid4()))\n email = Faker('email')\n name = Faker('name')\n username = Faker('user_name')\n bio = Faker('paragraph')\n avatar_url = Faker('image_url')\n role = factory.Iterator([r.value for r in UserRole])\n is_active = True\n created_at = Faker('date_time_this_year')\n updated_at = LazyFunction(datetime.utcnow)\n\nclass PostFactory(DjangoModelFactory):\n class Meta:\n model = Post\n\n id = LazyFunction(lambda: str(uuid.uuid4()))\n title = Faker('sentence', nb_words=6)\n # Drop the sentence's trailing period before building the slug\n slug = factory.LazyAttribute(lambda o: o.title.lower().rstrip('.').replace(' ', '-').replace(',', ''))\n # 'text' and 'paragraph' each return a single string ('paragraphs'/'sentences' return lists)\n content = Faker('text', max_nb_chars=1000)\n excerpt = Faker('paragraph', nb_sentences=2)\n status = factory.Iterator([s.value for s in PostStatus])\n author = SubFactory(UserFactory)\n published_at = Faker('date_time_this_month')\n created_at = Faker('date_time_this_year')\n updated_at = LazyFunction(datetime.utcnow)\n\nclass OrderFactory(DjangoModelFactory):\n class Meta:\n model = Order\n\n id = LazyFunction(lambda: str(uuid.uuid4()))\n user = SubFactory(UserFactory)\n total = LazyFunction(lambda: round(random.uniform(9.99, 999.99), 2))\n status = factory.Iterator(['PENDING', 'CONFIRMED', 'SHIPPED', 'DELIVERED', 'CANCELLED'])\n address = 
Faker('street_address')\n city = Faker('city')\n country = Faker('country')\n created_at = Faker('date_time_this_year')\n updated_at = LazyFunction(datetime.utcnow)\n\n\n# Usage in pytest conftest.py:\n#\n# @pytest.fixture\n# def user(db):\n# return UserFactory()\n#\n# @pytest.fixture\n# def post_with_author(db):\n# return PostFactory() # auto-creates a User via SubFactory\n#\n# @pytest.fixture\n# def many_orders(db):\n# return OrderFactory.create_batch(20)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"SQL Seed Output","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"sql"},"content":[{"text":"-- Generated by phy-test-data-factory\n-- Seed data for: users, posts, orders\n-- Insert in dependency order (parents first)\n\n-- Users (10 rows)\nINSERT INTO users (id, email, name, role, is_active, created_at) VALUES\n ('usr_001', 'alice@example.com', 'Alice Johnson', 'USER', true, '2026-01-15 09:30:00'),\n ('usr_002', 'bob@example.com', 'Bob Smith', 'ADMIN', true, '2026-01-20 14:00:00'),\n ('usr_003', 'carol@example.com', 'Carol Williams', 'USER', true, '2026-02-01 11:00:00'),\n -- ... (7 more rows)\n\n-- Posts (20 rows, requires users above)\nINSERT INTO posts (id, title, slug, status, author_id, created_at) VALUES\n ('post_001', 'Getting Started with Testing', 'getting-started-testing', 'PUBLISHED', 'usr_001', '2026-02-10 10:00:00'),\n ('post_002', 'Advanced Patterns in TypeScript', 'advanced-typescript', 'DRAFT', 'usr_002', '2026-02-15 11:30:00'),\n -- ... 
(18 more rows)","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Step 5: Output Report","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"markdown"},"content":[{"text":"## Test Data Factory — Generated\nSchema: prisma/schema.prisma | Models: User, Post, Comment, Order, Tag\nOutput: src/test/factories/index.ts\n\n---\n\n### Models Processed (dependency order)\n\n| Model | Fields | Relationships | Factory Type |\n|-------|--------|--------------|-------------|\n| User | 14 fields | — (root) | createUser() |\n| Tag | 4 fields | — (root) | createTag() |\n| Post | 11 fields | → User (author) | createPost() |\n| Comment | 8 fields | → User, → Post | createComment() |\n| Order | 9 fields | → User | createOrder() |\n\n---\n\n### Generated Files\n\n- `src/test/factories/index.ts` — all factory functions\n- `src/test/factories/builders.ts` — plain object builders (no DB write)\n- `src/test/setup.ts` — jest/vitest beforeAll/afterAll with clearTestData()\n\n---\n\n### Auto-Detected Semantic Mappings\n\n| Field | Detected As | Faker Function Used |\n|-------|------------|---------------------|\n| `email` | Email address | `faker.internet.email()` |\n| `avatarUrl` | Image URL | `faker.image.avatar()` |\n| `publishedAt` | Recent date | `faker.date.recent({ days: 90 })` |\n| `role` | Enum (USER/ADMIN/MODERATOR) | `faker.helpers.arrayElement([...])` |\n| `slug` | URL slug | `faker.helpers.slugify(faker.lorem.words(3))` |\n| `price` | Currency amount | `faker.number.float({ fractionDigits: 2 })` |\n\n---\n\n### Quick Usage\n\n```typescript\nimport { createUser, createPost, createOrder, clearTestData } from './factories';\n\n// Single record\nconst user = await createUser();\n\n// With overrides\nconst adminUser = await createUser({ role: 'ADMIN', email: 'admin@example.com' });\n\n// Relationships handled automatically\nconst post = await createPost(); // creates a User internally\nconst post2 = await 
createPost({ authorId: user.id }); // reuse existing User\n\n// Batch creation (uses the imported createOrder; no createOrders helper is generated above)\nconst orders = await Promise.all(Array.from({ length: 50 }, () => createOrder()));\n\n// Teardown\nafterAll(clearTestData);","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":""},"content":[{"text":"\n---\n\n## Edge Case Variants\n\nWith `--edge-cases`, generate additional factory variants for boundary testing:\n\n```typescript\n// Generated edge-case builders for User model:\n\nexport const edgeCaseUsers = {\n withMinLengthFields: () => buildUser({\n email: 'a@b.co',\n name: 'A',\n bio: '',\n }),\n withMaxLengthFields: () => buildUser({\n email: 'a'.repeat(250) + '@b.co', // 250 + 5 = 255 chars total\n name: 'A'.repeat(255),\n bio: 'x'.repeat(5000),\n }),\n withNullableFieldsNull: () => buildUser({\n bio: null,\n avatarUrl: null,\n phoneNumber: null,\n }),\n withSpecialCharacters: () => buildUser({\n name: \"O'Brien-Smith, Jr.\",\n bio: '\u003cscript>alert(\"xss\")\u003c/script>', // for XSS testing\n }),\n withUnicodeContent: () => buildUser({\n name: '张伟',\n bio: '日本語テキスト with emoji 🎉',\n }),\n withPastDates: () => buildUser({\n createdAt: new Date('2000-01-01'),\n }),\n withFutureDates: () => buildUser({\n createdAt: new Date('2099-12-31'),\n }),\n};","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"Install Dependencies","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# TypeScript / JavaScript\nnpm install -D @faker-js/faker\n\n# Python (Django)\npip install factory_boy faker\n\n# Python (SQLAlchemy)\npip install factory_boy faker sqlalchemy\n\n# Verify installation\nnode -e \"const { faker } = require('@faker-js/faker'); console.log(faker.person.fullName())\"\npython3 -c \"import factory; print('factory_boy 
ready')\"","type":"text"}]},{"type":"hr","attrs":{"markup":"---"}}]},"metadata":{"date":"2026-06-05","name":"phy-test-data-factory","author":"@skillopedia","source":{"stars":2012,"repo_name":"openclaw-master-skills","origin_url":"https://github.com/leoyeai/openclaw-master-skills/blob/HEAD/skills/phy-test-data-factory/SKILL.md","repo_owner":"leoyeai","body_sha256":"3aaf4c52e10fd5163d55ae9171d4fcfa8399e91609735e2e6c0096180d2940c9","cluster_key":"ed46f689782f8ac03a6d6e06eb6f53104651de6684995156565bcdaa1d089757","clean_bundle":{"format":"clean-skill-bundle-v1","source":"leoyeai/openclaw-master-skills/skills/phy-test-data-factory/SKILL.md","attachments":[{"id":"033f4f86-fa41-5c46-9f1d-1902c584d936","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/033f4f86-fa41-5c46-9f1d-1902c584d936/attachment.json","path":"_meta.json","size":294,"sha256":"50e2dceef8bff3162e02f13b8b367f78d762e7f67477129e96fd89ea68303e70","contentType":"application/json; charset=utf-8"}],"bundle_sha256":"de504da3123b40f9a9ce8981bace17dd0a298732e50baf9f7e05c44687a9f9d0","attachment_count":1,"text_attachments":1,"attachment_storage":"skillopedia-attachments-v1","binary_attachments":0,"excluded_attachments":[]},"cluster_size":1,"skill_md_path":"skills/phy-test-data-factory/SKILL.md","import_metadata":{"date":"2026-06-05","author":"@skillopedia","version":"v1","category":"testing-qa","category_label":"Testing"},"exact_dupes_collapsed_into_this":0},"license":"Apache-2.0","version":"v1","category":"testing-qa","metadata":{"tags":["testing","test-data","fixtures","faker","factory-boy","prisma","sqlalchemy","django","seed-data","developer-tools"],"author":"PHY041","version":"1.0.0"},"import_tag":"clean-skills-v1","description":"Schema-driven test data factory generator. Reads your database schema or model definitions — Prisma schema, SQLAlchemy models, Django models, TypeORM entities, Zod schemas, Pydantic models, or raw SQL DDL — and generates ready-to-use factory functions with realistic fake data. 
Outputs TypeScript factory files using Faker.js, Python conftest.py using factory_boy + Faker, or raw SQL INSERT seed scripts. Respects foreign key relationships (seeds parents before children), handles enums, nullable fields, unique constraints, and generates edge-case variants (empty strings, max-length values, boundary dates). Zero external API — pure local file analysis + code generation. Triggers on \"generate test data\", \"seed database\", \"test fixtures\", \"factory functions\", \"fake data from schema\", \"/test-data-factory\"."}},"renderedAt":1782987370684}
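
The description above says the generator seeds parents before children and tears down in the reverse order. A minimal sketch of that ordering step follows, assuming the foreign-key relationships have already been extracted into a mapping from each model to the models it references. The `MODEL_DEPENDENCIES` map and the `seed_order` / `teardown_order` names are hypothetical illustrations, not part of the tool's documented output; the sort itself uses Python's standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def seed_order(depends_on):
    """Return model names so that every parent appears before its children.

    `depends_on` maps a model name to the set of models it references.
    TopologicalSorter raises graphlib.CycleError for circular references.
    """
    return list(TopologicalSorter(depends_on).static_order())


def teardown_order(depends_on):
    """Return the reverse of the seed order: children are deleted before parents."""
    return list(reversed(seed_order(depends_on)))


# Hypothetical dependency map mirroring the models in the sample report.
MODEL_DEPENDENCIES = {
    "User": set(),
    "Tag": set(),
    "Post": {"User"},
    "Comment": {"User", "Post"},
    "Order": {"User"},
}

if __name__ == "__main__":
    print("seed:    ", seed_order(MODEL_DEPENDENCIES))
    print("teardown:", teardown_order(MODEL_DEPENDENCIES))
```

The relative position of unrelated models (for example `Tag` and `Order`) is not guaranteed by a topological sort; only the parent-before-child constraint is.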

Test Data Factory Writing test setup is slower than writing the test itself. You have a model with 12 fields, a model that requires a User, and a model that requires both. Every test file re-invents the same boilerplate, with slightly different hardcoded values that don't cover edge cases. Paste your schema and get a complete factory module: realistic Faker-powered defaults for every field, relationship-aware ordering, and one-line overrides for specific test scenarios. Reads any schema format. Outputs TypeScript, Python, or SQL. Zero external APIs. --- Trigger Phrases - "generate test data…