Scientific Deployment & Experiment Design¶
This document describes how to deploy the Agent Registry on a public testnet, run controlled experiments with autonomous agents, collect data, and analyze results for a scientific paper.
Requirements¶
Hardware & Accounts¶
| Requirement | Cost | Where to Get |
|---|---|---|
| Machine with Node.js 18+ and Python 3.10+ | Free (your laptop) | Already have |
| MetaMask or any Ethereum wallet (for deployer key) | Free | metamask.io |
| Base Sepolia testnet ETH (for deployment + relayer) | Free | Base Sepolia faucet |
| A server for relayer + API (Railway, Fly.io, or VPS) | Free tier available | railway.app / fly.io |
| GitHub repo for reproducibility | Free | github.com |
Software Dependencies¶
# Node.js (Hardhat, relayer, API)
node >= 18.0.0
npm >= 9.0.0
# Python (SDK, test harness, data analysis)
python >= 3.10
pip install web3 requests python-dotenv eth-account pandas matplotlib
# Hardhat toolchain
npm install --save-dev hardhat @nomicfoundation/hardhat-toolbox
Testnet Setup¶
Why Base Sepolia?
Base Sepolia is the recommended testnet because:
- Free ETH from faucets (no cost)
- Same EVM as mainnet (identical contract behavior)
- Block times ~2s (fast enough for experiments)
- Block explorer available (Sepolia Basescan) for public verification
- Transactions are publicly recorded and verifiable on-chain -- reviewers can independently check all experimental data for as long as the testnet remains live
Getting testnet ETH:
- Go to the Coinbase Base Sepolia faucet
- Or use QuickNode faucet
- Request 0.1 ETH (more than enough for thousands of test transactions)
- You need two wallets funded:
- Deployer wallet -- deploys contracts (~0.01 ETH)
- Relayer wallet -- pays gas for gasless registrations (~0.05 ETH for hundreds of tests)
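The funding figures above can be sanity-checked with back-of-envelope arithmetic. The gas-per-registration and gas-price figures below are assumptions for illustration, not measured values; actual costs depend on the deployed contracts and the current base fee:

```python
# Back-of-envelope relayer budget check (illustrative figures only).
GAS_PER_REGISTRATION = 150_000   # assumed upper bound for one meta-tx registration
GAS_PRICE_GWEI = 0.1             # assumed typical Base Sepolia gas price

def eth_cost(n_registrations: int) -> float:
    """ETH spent by the relayer for n gasless registrations."""
    return n_registrations * GAS_PER_REGISTRATION * GAS_PRICE_GWEI * 1e-9

# Under these assumptions, 100 registrations cost 0.0015 ETH,
# well inside the 0.05 ETH relayer budget.
budget_ok = eth_cost(100) < 0.05
```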
Research Questions¶
The experiment addresses the following research questions:
| # | Research Question | Measured By |
|---|---|---|
| RQ1 | Can gasless registration achieve a 100% success rate? | Registration success rate, latency, gas costs |
| RQ2 | What are real-world gas costs for each operation? | Gas per operation, relayer costs, infrastructure costs |
| RQ3 | Can the 7-day attestation cycle maintain compliance across 100 agents? | Compliance rate, time-to-detection of lapsed attestation |
| RQ4 | Does multi-generation lineage (depth 3+) work reliably? | Lineage tree correctness, generation depth limits |
| RQ5 | What is the registration throughput limit? | Registration latency, throughput under load |
| RQ6 | Can the KYA model detect non-compliant agents accurately? | KYA query accuracy, false positive/negative rates |
Experiment Scenarios¶
Scenario A: Batch Registration¶
Register 100 agents in rapid succession with varied capabilities and operational scopes.
Design
- Mix of root agents and multi-generation children (up to depth 5)
- Measure: registration latency, gas per registration, throughput
- Vary: gasless vs. direct mode, batch size, concurrent registrations
- Agent categories: WebCrawler, ContentCreator, TradingBot, CustomerService, DevOps, ResearchAgent, DataProcessor, Orchestrator, GenerativeAI, SecurityAudit
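The latency and throughput measurements above can be driven by a small harness. This is a minimal sketch: `register_agent` is a hypothetical stand-in for the real relayer call, and the sleep simulates a network round trip:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def register_agent(agent_id: int) -> float:
    """Stand-in for one registration via the relayer (hypothetical call).
    Returns the observed latency in seconds."""
    start = time.perf_counter()
    time.sleep(0.001)  # placeholder for the network round trip
    return time.perf_counter() - start

def run_batch(n: int, concurrency: int) -> dict:
    """Register n agents with a bounded worker pool; report batch statistics."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(register_agent, range(n)))
    elapsed = time.perf_counter() - start
    return {
        "n": n,
        "mean_latency_s": sum(latencies) / n,
        "throughput_rps": n / elapsed,
    }

stats = run_batch(n=20, concurrency=5)
```

Varying `concurrency` and `n` covers the batch-size and concurrent-registration axes of the design.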
Scenario B: Multi-Generation Lineage¶
Register a lineage tree of root agents, children, grandchildren, and great-grandchildren, reaching generation depth 3.
Design
- Root agents each spawn 1-3 children
- Children spawn grandchildren (generation 2)
- Select chains extend to great-grandchildren (generation 3)
- Measure: lineage tree correctness, generation depth enforcement, parent-child relationship integrity
- Verify: API returns recursive tree structure accurately
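Lineage correctness can be checked off-chain by recomputing each agent's generation from recorded parent-child edges and comparing against the API's tree. A minimal sketch, using hypothetical agent names:

```python
def generation(agent: str, parent_of: dict) -> int:
    """Generation of an agent: 0 for roots, parent's generation + 1 otherwise."""
    depth = 0
    while agent in parent_of:
        agent = parent_of[agent]
        depth += 1
    return depth

# Hypothetical chain: root -> child -> grandchild -> great-grandchild
parent_of = {"child": "root", "grandchild": "child", "ggchild": "grandchild"}

assert generation("root", parent_of) == 0
assert generation("ggchild", parent_of) == 3  # generation depth 3, as in the scenario
```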
Scenario C: Compliance Lifecycle¶
Test the attestation and compliance lifecycle across the agent population.
Design
- Register agents, all attesting initially
- Allow a subset to lapse their attestation
- Query KYA for all agents before and after lapse
- Measure: compliance detection accuracy, time-to-detection, compliance rate over time
- Verify: lapsed agents correctly flagged as non-compliant
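The compliance check itself reduces to a timestamp comparison against the 7-day attestation window. A minimal sketch of the detection logic the harness can apply to attestation records:

```python
from datetime import datetime, timedelta, timezone

ATTESTATION_WINDOW = timedelta(days=7)

def is_compliant(last_attested: datetime, now: datetime) -> bool:
    """An agent is compliant iff its last attestation falls within the window."""
    return now - last_attested <= ATTESTATION_WINDOW

# Hypothetical agents: one attested 2 days ago, one lapsed at 9 days.
now = datetime(2025, 1, 15, tzinfo=timezone.utc)
fresh = now - timedelta(days=2)
lapsed = now - timedelta(days=9)
```

Time-to-detection is then the gap between the moment the window elapses and the first KYA query that reports the agent as non-compliant.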
Scenario D: Revenue Reporting¶
Test revenue reporting across varied currencies and categories.
Design
- Agents report revenue in multiple currencies (USDC, EUR, ETH)
- Revenue categories: compute_services, service_fees, data_sales, consulting
- Revenue tiers: zero ($0), low ($1-$50), medium ($51-$500), high ($501-$5,000)
- Measure: data integrity, query performance, aggregate accuracy
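Aggregate accuracy can be verified by recomputing per-currency, per-category totals from the raw reports and comparing them to the API's aggregates. A minimal sketch with hypothetical report data:

```python
from collections import defaultdict

# Hypothetical revenue reports as (currency, category, amount) tuples.
reports = [
    ("USDC", "compute_services", 120.0),
    ("USDC", "service_fees", 40.0),
    ("EUR", "consulting", 800.0),
    ("ETH", "data_sales", 0.0),
]

def aggregate(reports):
    """Sum reported amounts per (currency, category) pair."""
    totals = defaultdict(float)
    for currency, category, amount in reports:
        totals[(currency, category)] += amount
    return dict(totals)

totals = aggregate(reports)
```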
Scenario E: Regulatory Actions¶
Test the full regulatory action lifecycle: suspend, revoke, reactivate.
Design
- Register agents, add regulators
- Suspend subset of agents -- verify status = Suspended, isCompliant = false
- Reactivate subset -- verify status = Active, isCompliant = true
- Revoke subset -- verify permanent state change
- Remove regulators (cleanup)
- Measure: state transition correctness, event completeness, wallet reuse after revocation
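State transition correctness can be checked against an explicit model of the lifecycle. This sketch assumes the transitions implied by the design above (suspend and reactivate are reversible, revocation is terminal); the status names mirror the scenario, not a confirmed contract interface:

```python
# Allowed regulatory transitions under the assumed lifecycle.
TRANSITIONS = {
    ("Active", "suspend"): "Suspended",
    ("Suspended", "reactivate"): "Active",
    ("Active", "revoke"): "Revoked",
    ("Suspended", "revoke"): "Revoked",
}

def apply_action(status: str, action: str) -> str:
    """Return the next status, rejecting illegal transitions."""
    if status == "Revoked":
        raise ValueError("Revoked is permanent")
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {status} -> {action}")
```

Replaying observed events through this model and diffing against on-chain status exposes any divergence.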
Scenario F: KYA Validation¶
Test KYA (Know Your Agent) accuracy with true positives and true negatives.
Design
- Query registered, compliant agents (expected: true positive)
- Query registered but lapsed agents (expected: registered but not compliant)
- Query suspended/revoked agents (expected: not compliant)
- Query unregistered wallets (expected: not registered)
- Measure: accuracy, false positive rate, false negative rate
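The accuracy metrics reduce to a standard confusion matrix over (expected, reported) compliance pairs. A minimal sketch with a hypothetical run:

```python
def kya_metrics(results):
    """results: list of (expected_compliant, reported_compliant) pairs."""
    tp = sum(1 for e, r in results if e and r)          # correctly compliant
    tn = sum(1 for e, r in results if not e and not r)  # correctly flagged
    fp = sum(1 for e, r in results if not e and r)      # lapse missed
    fn = sum(1 for e, r in results if e and not r)      # wrongly flagged
    return {
        "accuracy": (tp + tn) / len(results),
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }

# Hypothetical run: 8 compliant agents, 2 lapsed, one lapse missed by the query.
results = [(True, True)] * 8 + [(False, False)] + [(False, True)]
m = kya_metrics(results)
```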
Data Collection Metrics¶
On-Chain Data (Primary Source)¶
All primary data is recorded on-chain and independently verifiable:
| Data Point | Contract Event | Paper Metric |
|---|---|---|
| Registration events | AgentRegistered event logs | Registration count, rate, gas cost |
| Compliance attestations | ComplianceAttested event logs | Attestation frequency, lapse rate |
| Revenue reports | RevenueReported event logs | Economic activity volume |
| Status changes | AgentSuspended/Revoked/Reactivated events | Regulatory action frequency |
| Capability updates | CapabilityUpdated events | Self-modification frequency |
| Child spawning | ChildSpawned events | Replication rate, lineage depth |
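Archived event data can be summarized directly from the JSONL export. The record schema below (`event`, `gasUsed` fields) is an assumption for illustration; adapt the field names to the actual archive format:

```python
import json

# Archived event records, one JSON object per line (assumed schema).
jsonl = """\
{"event": "AgentRegistered", "gasUsed": 145000}
{"event": "AgentRegistered", "gasUsed": 150212}
{"event": "ComplianceAttested", "gasUsed": 48000}
"""

def summarize(lines: str) -> dict:
    """Per-event counts and average gas, for the paper's cost tables."""
    counts, gas = {}, {}
    for line in lines.splitlines():
        rec = json.loads(line)
        ev = rec["event"]
        counts[ev] = counts.get(ev, 0) + 1
        gas[ev] = gas.get(ev, 0) + rec["gasUsed"]
    return {ev: {"count": counts[ev], "avg_gas": gas[ev] / counts[ev]} for ev in counts}

summary = summarize(jsonl)
```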
Off-Chain Metrics (Supplementary)¶
| Metric | Source | Purpose |
|---|---|---|
| Registration latency | Test harness timestamps | UX and throughput analysis |
| KYA query latency | API response times | Enforcement feasibility |
| Relayer gas consumption | Relayer status endpoint | Cost model validation |
| Cache hit rates | API server logs | Scalability assessment |
| Rate limit triggers | Relayer logs | Security analysis |
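For the latency metrics, the paper should report percentiles rather than only means, since a few slow requests can dominate the average. A minimal sketch of the summary computation over harness-recorded samples (values here are hypothetical):

```python
import statistics

def latency_summary(samples_ms):
    """Median, approximate p95 (nearest-rank), and mean of latency samples."""
    s = sorted(samples_ms)
    return {
        "p50": statistics.median(s),
        "p95": s[max(0, round(0.95 * len(s)) - 1)],
        "mean": statistics.fmean(s),
    }

# One slow outlier (450 ms) pulls the mean well above the median.
samples = [120, 95, 110, 450, 100, 105, 98, 130, 115, 102]
summary = latency_summary(samples)
```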
Deployment Steps¶
Step 1: Clone and Install¶
Step 2: Configure Environment¶
cp .env.example .env
# Edit .env:
# DEPLOYER_KEY=0x<your-deployer-private-key>
# RELAYER_KEY=0x<your-relayer-private-key>
Step 3: Run Tests Locally¶
Tip
All tests should pass before deployment. This validates the contracts.
Step 4: Deploy to Base Sepolia¶
The output will include the deployed contract addresses. Add these to your .env file.
Step 5: Start Services¶
Step 6: Verify Deployment¶
curl https://api.theagentregistry.org/health
curl https://relay.theagentregistry.org/status
curl https://api.theagentregistry.org/api/v1/stats
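Beyond eyeballing the curl output, the harness can validate the response shape programmatically. The field names below are illustrative assumptions, not the actual API contract; 84532 is the Base Sepolia chain ID:

```python
def healthy(resp: dict) -> bool:
    """Check a parsed /health response against the assumed schema."""
    return (
        resp.get("status") == "ok"
        and isinstance(resp.get("chainId"), int)
        and isinstance(resp.get("registryAddress"), str)
        and resp["registryAddress"].startswith("0x")
    )

# Hypothetical healthy response for a Base Sepolia deployment.
example = {"status": "ok", "chainId": 84532, "registryAddress": "0x" + "0" * 40}
```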
Reproducibility Checklist¶
For the paper to be scientifically credible, ensure:
- All contract source code is open-source (MIT license)
- Deployed contract addresses on Base Sepolia are documented
- Contracts are verified on Basescan (source code publicly readable)
- All experiment scripts are in the repository
- Raw experiment data (JSONL files) is archived and linked
- Analysis scripts reproduce all tables and figures
- Environment setup instructions allow replication from scratch
- Transaction hashes for all experiments are recorded
- Gas costs are documented at the time of experiments
- The testnet deployment remains live for reviewer verification
Cost Summary¶
| Item | Cost |
|---|---|
| Base Sepolia testnet ETH | Free (faucet) |
| Contract deployment gas | Free (testnet) |
| 100 agent registrations | Free (testnet) |
| Server for relayer + API (Railway free tier) | $0 |
| Domain for API endpoint (optional) | ~$10/year |
| Total cost for scientific experiment | $0 - $10 |
Zero-cost science
The entire scientific experiment can be run for effectively zero cost on testnet. Mainnet deployment (for production) would cost approximately $5-10 for contract deployment and ~$50/month for relayer operation at the 1,000-agent scale.
Timeline¶
| Week | Activity |
|---|---|
| 1 | Fix bugs, expand tests, deploy to testnet |
| 2 | Run Scenarios A, B, C (registration, lineage, compliance) |
| 3 | Run Scenarios D, E, F (revenue, regulatory, KYA) |
| 4 | Data analysis, chart generation |
| 5-6 | Paper writing (system design, evaluation, discussion) |
| 7 | Internal review, revisions |
| 8 | Submission |
Total: approximately 8 weeks from deployment to submission.