New: Subscription plans now available for CRO & pharma labs — Starter from $299/mo →
No CAS number. No entry in any database. No SDS — until now. ChemEngine builds a proxy SDS for your proprietary research chemical using read-across from structurally similar compounds. ECHA-documented. Audit-ready. Delivered in minutes.
Sample document preview: Proxy Safety Data Sheet · ChemEngine Datatools, R&D Edition · Rev. 1 · 2026 · 8 pages
Built for the researchers who work before the database exists
The Problem
Your novel research compound — synthesized in-house, under NDA — doesn't appear in PubChem, ChemSpider, or ECHA CHEM. Standard SDS generators return nothing.
Commissioning a toxicologist to write a custom SDS for a pre-candidate compound costs $2,000–$8,000 and takes 3–6 weeks. Most discovery compounds never make it that far.
OSHA HazCom and internal safety protocols require SDS documentation before a chemical can be handled. No SDS means no lab work.
The Solution
“Your safety officer needs documentation before the compound enters the lab, but your compound has no CAS number and no entry in any database. ChemEngine builds a proxy SDS using read-across from structurally similar compounds: ECHA-documented and audit-ready.”
We find the closest known analog in ECHA's database using Tanimoto coefficient scoring. Only high-confidence matches scoring at or above 0.85 are used.
Hazard data is carried across from the reference compound using ECHA's documented Read-Across Assessment Framework (RAAF). Conservative classifications are applied.
Every section discloses the proxy source, similarity score, and confidence level. Your safety officer sees exactly what was estimated vs. confirmed.
Ready in under 10 minutes — not 6 weeks.
Submit your SMILES, InChI, or compound name. We return a complete 16-section proxy SDS as an editable .docx with your lab's name on it.
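For readers who want to see what the Tanimoto score above measures, here is a minimal sketch. It assumes each compound has already been reduced to a fingerprint, represented as the set of "on" bit positions. The fingerprints below are made-up illustration data, not real compounds, and the function is not ChemEngine's actual implementation.

```python
from __future__ import annotations

MIN_SIMILARITY = 0.85  # threshold quoted in the text


def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits set in either fingerprint."""
    union = len(fp_a | fp_b)
    if union == 0:
        # Assumption made for this sketch: two empty fingerprints do not count as a match.
        return 0.0
    return len(fp_a & fp_b) / union


# Hypothetical fingerprints: 20 on-bits each, 19 of them shared.
novel_fp = set(range(0, 20))
analog_fp = set(range(1, 21))

score = tanimoto(novel_fp, analog_fp)  # 19 shared bits out of 21 distinct bits
qualifies = score >= MIN_SIMILARITY
print(f"score={score:.3f} qualifies={qualifies}")
```

In practice the fingerprints would come from a cheminformatics toolkit. The fingerprint type and its settings affect the score, so a 0.85 cutoff only means something relative to one fixed fingerprint scheme.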
How It Works
Paste a SMILES string, InChI/InChIKey, partial CAS, compound name, or molecular formula.
We search ECHA's database for the closest structural analog with Tanimoto ≥ 0.85 and verified safety data.
Hazard, physical, and toxicological data are carried across using ECHA RAAF methodology. Conservative classifications are applied.
Your 16-section proxy SDS arrives as a white-labeled editable .docx with full confidence disclosure and audit trail.
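The matching step can be sketched as a simple selection rule: keep only candidates that have verified safety data and score at or above 0.85, pick the closest one, and return nothing when no candidate qualifies. The `Candidate` type, its field names, and the example identifiers are hypothetical, introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

MIN_SIMILARITY = 0.85  # threshold quoted in the text


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one possible reference compound."""
    identifier: str         # placeholder label for the reference compound
    similarity: float       # Tanimoto score against the submitted structure
    has_safety_data: bool   # whether verified safety data is available


def pick_reference(candidates: list[Candidate]) -> Optional[Candidate]:
    """Return the closest qualifying analog, or None if nothing qualifies."""
    eligible = [
        c for c in candidates
        if c.has_safety_data and c.similarity >= MIN_SIMILARITY
    ]
    if not eligible:
        return None  # no qualifying analog: no document should be generated
    return max(eligible, key=lambda c: c.similarity)


candidates = [
    Candidate("analog-A", 0.93, has_safety_data=False),  # closest, but no data
    Candidate("analog-B", 0.88, has_safety_data=True),
    Candidate("analog-C", 0.80, has_safety_data=True),   # below threshold
]
best = pick_reference(candidates)
print(best.identifier if best else "no qualifying analog")
```

Returning `None` rather than falling back to a weaker match mirrors the stated behavior that no proxy SDS is produced without a qualifying analog.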
Accepted input formats: SMILES · InChI / InChIKey · partial CAS · compound name · molecular formula
Pricing
One-off or monthly. Pay per compound or subscribe for your whole discovery pipeline.
Pay per SDS · $49 per sheet · No subscription needed: generate a single proxy SDS
Starter · $299/mo · 10 R&D SDS per month · ≈ $29.90 per sheet · No setup fee · Cancel anytime
Pro · $999/mo · 50 R&D SDS per month · ≈ $19.98 per sheet · No setup fee · Cancel anytime
Enterprise · $2,499/mo · Unlimited SDS + API access · Unlimited throughput · No setup fee · Cancel anytime
| Feature | Starter | Pro | Enterprise |
|---|---|---|---|
| Monthly SDS quota | 10 | 50 | Unlimited |
| SMILES / InChI / name input | ✓ | ✓ | ✓ |
| Tanimoto proxy matching ≥0.85 | ✓ | ✓ | ✓ |
| 16-section GHS/OSHA SDS | ✓ | ✓ | ✓ |
| Confidence disclosure per section | ✓ | ✓ | ✓ |
| Editable .docx + PDF | ✓ | ✓ | ✓ |
| Bulk CSV batch upload | — | ✓ | ✓ |
| Project folders | — | ✓ | ✓ |
| Expert proxy override | — | ✓ | ✓ |
| Team seats | 1 | 2 | Unlimited |
| REST API access | — | — | ✓ |
| JSON confidence metadata | — | — | ✓ |
| ELN integration readiness | — | — | ✓ |
| Internal approval workflow | — | — | ✓ |
| Dedicated support | — | — | ✓ |
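As a rough illustration of what per-section confidence metadata in JSON could look like, the sketch below records, for each SDS section, whether the content is confirmed or estimated and, if estimated, the proxy source, similarity score, and confidence level. The field names, the `"confirmed"`/`"estimated"` labels, the confidence values, and the `analog-B` identifier are all assumptions for this example; the actual schema is not given on this page.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SectionProvenance:
    """Hypothetical provenance record for one SDS section."""
    section: int                        # SDS section number, 1 to 16
    title: str
    basis: str                          # "confirmed" or "estimated" (assumed labels)
    proxy_source: Optional[str] = None  # reference compound used for read-across
    similarity: Optional[float] = None  # Tanimoto score of that reference
    confidence: Optional[str] = None    # assumed label, e.g. "high"

    def __post_init__(self) -> None:
        if self.basis not in ("confirmed", "estimated"):
            raise ValueError(f"unknown basis: {self.basis!r}")
        # An estimated section must always disclose where its data came from.
        if self.basis == "estimated" and (
            self.proxy_source is None or self.similarity is None
        ):
            raise ValueError("estimated sections need proxy_source and similarity")


def to_json(records: list[SectionProvenance]) -> str:
    """Serialize the per-section records as a JSON array."""
    return json.dumps([asdict(r) for r in records], indent=2)


records = [
    SectionProvenance(1, "Identification", "confirmed"),
    SectionProvenance(11, "Toxicological information", "estimated",
                      proxy_source="analog-B", similarity=0.88, confidence="high"),
]
print(to_json(records))
```

Validating at construction time means an estimated section cannot be serialized without its source and score, which keeps the disclosure complete by design.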
Compliance & Trust
Every proxy SDS is generated against documented ECHA methodology and discloses its data provenance transparently.
Proxy SDS backed by Tanimoto similarity ≥0.85 to confirmed compounds
We only generate a proxy SDS when we can find a structurally similar reference compound with a Tanimoto score at or above 0.85 in ECHA's chemical database. That cutoff is the minimum we apply for data bridging. ECHA's Read-Across Assessment Framework (RAAF) describes how a read-across justification is assessed and documented, but it does not itself prescribe a numerical similarity threshold. If no qualifying analog exists, we tell you, and no document is generated.
GHS Rev. 9 · UN Globally Harmonized System
ECHA RAAF · Read-Across Assessment Framework
OSHA HazCom · 29 CFR 1910.1200
EU REACH · Annex II SDS
EU CLP · Regulation (EC) No 1272/2008
GLP Ready · Audit trail included
R&D Use Disclaimer: Proxy SDS documents are generated for internal research and development use only. They are not suitable for commercial product distribution, REACH registration submissions, or transport of materials for commercial sale. Appropriate laboratory safety controls should be maintained regardless of proxy SDS classification.
FAQ
A proxy SDS (also called a read-across SDS) is a Safety Data Sheet generated for a novel compound by borrowing safety data from a structurally similar, well-documented compound. Rather than leaving your lab with no documentation at all, it provides a conservative, science-based estimate of hazard properties. ChemEngine follows ECHA's Read-Across Assessment Framework (RAAF) methodology and only bridges data from compounds with Tanimoto similarity ≥0.85, the threshold we apply for high-confidence structural analogy. Every section that uses estimated data is clearly labeled.
For internal R&D and laboratory handling purposes, a proxy SDS generated via documented read-across methodology satisfies OSHA's HazCom requirement that employees have access to chemical hazard information, within a defined scope. The key conditions: (1) it must be clearly labeled as a proxy/estimated document; (2) the methodology must be documented (ours uses ECHA RAAF, disclosed per section); and (3) it must not be used for commercial distribution under GHS/HazCom regulations, REACH registration, or transport of materials for commercial sale. Your EHS team should review it before it is used as official lab documentation. We include a mandatory 3-step acknowledgment in the generation flow that captures these limitations.
ChemEngine accepts five input types for novel compounds: (1) SMILES string — preferred, provides the best Tanimoto scoring accuracy; (2) InChI or InChIKey — converted to SMILES server-side via RDKit; (3) Partial or non-standard CAS — used as fallback text search in PubChem; (4) Compound name or internal designation (e.g., 'Compound 23B', 'NVX-301') — triggers PubChem fuzzy name search; (5) Molecular formula (e.g., C₁₄H₁₉N₃O₂) — lowest precision, generates multiple proxy candidates for you to select from. SMILES is strongly recommended for unambiguous compound identification and highest proxy match quality.
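To make the five input types more concrete, here is a rough sketch of how a submission could be routed by pattern before any lookup happens. The function name, the returned labels, and the regular expressions are assumptions for illustration only. These are surface heuristics, not structure validation: a short string such as `CO` is genuinely ambiguous between a SMILES and a formula, and a real pipeline would confirm a SMILES by parsing it with a cheminformatics toolkit.

```python
import re

_SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")
_INCHIKEY = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")
_CAS_LIKE = re.compile(r"[\d-]*\d[\d-]*")              # digits and hyphens only
_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")
_FORMULA_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")
# Bracket atoms, organic-subset atoms, aromatic atoms, bonds/branches, ring closures.
_SMILES = re.compile(
    r"(?:\[[^\[\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|[-=#$:/\\.()]|%\d{2}|\d)+"
)


def _looks_like_formula(text: str) -> bool:
    # Heuristic: needs at least one count and no repeated element symbol,
    # so ring-closure SMILES such as "C1CCCCC1" are not mistaken for formulas.
    if not _FORMULA.fullmatch(text) or not any(ch.isdigit() for ch in text):
        return False
    elements = [symbol for symbol, _count in _FORMULA_TOKEN.findall(text)]
    return len(elements) == len(set(elements))


def classify_input(raw: str) -> str:
    """Guess which of the five accepted input types a submission is."""
    text = raw.strip().translate(_SUBSCRIPTS)  # C₁₄H₁₉N₃O₂ becomes C14H19N3O2
    if text.startswith("InChI="):
        return "inchi"
    if _INCHIKEY.fullmatch(text):
        return "inchikey"
    if _CAS_LIKE.fullmatch(text):
        return "cas_like"
    if _looks_like_formula(text):
        return "formula"
    if _SMILES.fullmatch(text):
        return "smiles"
    return "name"  # fall back to a name or internal designation


for example in ["CC(=O)Oc1ccccc1C(=O)O", "C₁₄H₁₉N₃O₂", "NVX-301", "50-78-2"]:
    print(example, "->", classify_input(example))
```

The check order matters: the formula test runs before the SMILES test because a formula such as `CCl4` also passes the SMILES character pattern, while anything that matches neither pattern falls through to a name search.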
Don't let missing documentation block your discovery pipeline. One-off sheets from $49, or subscribe from $299/month for your whole CRO or pharma lab.
One-off $49 · Starter $299/mo · Pro $999/mo · Enterprise $2,499/mo · No setup fee