Scanning large sites in Enterprise Edition TN-W20
Enterprise Edition scans are limited by the license type:
- 25k licenses can report/map up to 25,000 pages per scan
- 50k licenses can report/map up to 50,000 pages per scan
- 100k licenses can report/map up to 100,000 pages per scan
Memory limits may also come into play:
- Each scan process is limited to 1.5GB of RAM to avoid interfering with other scans or the web app
- Each unique URL used on the site occupies 48 bytes plus the number of bytes in the URL
- Each issue reported uses 48 bytes for each page it’s reported on
- Each line reported for an issue uses 8 bytes
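The per-item costs above can be combined into a rough memory estimate for a scan. The sketch below uses hypothetical site figures (page count, URL length, issues per page); only the 48-byte and 8-byte costs come from this note.

```python
# Rough scan memory estimate using the per-item costs above.
# Site figures passed to the function are hypothetical examples.
URL_OVERHEAD = 48      # bytes per unique URL, plus the URL's own length
ISSUE_OVERHEAD = 48    # bytes per issue, per page it's reported on
LINE_COST = 8          # bytes per line number reported for an issue

def scan_memory_bytes(unique_urls, avg_url_len, issues_per_page, lines_per_issue):
    url_bytes = unique_urls * (URL_OVERHEAD + avg_url_len)
    issue_bytes = unique_urls * issues_per_page * ISSUE_OVERHEAD
    line_bytes = unique_urls * issues_per_page * lines_per_issue * LINE_COST
    return url_bytes + issue_bytes + line_bytes

# Example: 100,000 pages, 80-byte URLs, 10 issues/page, 4 lines/issue
total = scan_memory_bytes(100_000, 80, 10, 4)
print(f"{total / 1024**2:.1f} MB")  # about 88.5 MB, well under the 1.5GB limit
```

Even a 100k-page scan with these figures stays far below the 1.5GB per-process cap; it is very long URLs or runaway issue counts that push scans toward the limit.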
For sitemaps, the maximum number of pages that can be reported is 50,000 on a 50k license. One sitemap report, the Excel Link Report, lists each link on every page, so it can become very large if each page has a large number of links (a 300MB CSV file is not uncommon).
For scan reports the maximum number of issues and lines reported is controlled by these settings in Edit Scan:
- Maximum pages listed per issue (default 20)
- Maximum line numbers per issue (default 4)
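These two settings bound the worst-case size of a scan report. A quick back-of-the-envelope calculation, assuming the defaults and treating the "over 1300" rule count as a round figure:

```python
# Worst-case number of reported line references under default settings.
# RULE_COUNT is an approximation of the "over 1300" figure in this note.
MAX_PAGES_PER_ISSUE = 20   # default "Maximum pages listed per issue"
MAX_LINES_PER_ISSUE = 4    # default "Maximum line numbers per issue"
RULE_COUNT = 1300          # approximate number of built-in rules

max_entries = RULE_COUNT * MAX_PAGES_PER_ISSUE * MAX_LINES_PER_ISSUE
print(max_entries)  # 104000 line references at most
```

Raising either setting multiplies this bound, which is why large scans with generous settings produce large reports.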
There is a fixed number of rules (over 1,300) for checking accessibility and browser compatibility, but HTML validation adds to these because a new rule is created for each unknown element: “Element XYZ not allowed as child element”.
Coding errors like this:
<adata-FFE4='FFED'>
<adata-D5FE='AAC4'>
which are missing the space after the element name and should read:
<a data-FFE4='FFED'>
<a data-D5FE='AAC4'>
can result in large amounts of memory being used, because a rule is created for each unknown element:
- “Element adata-FFE4 not allowed as child element”
- “Element adata-D5FE not allowed as child element”
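To see why this inflates memory, note that every distinct malformed element name becomes its own rule. The sketch below is an illustration of that effect, not the scanner's actual code; the known-element set is a tiny stand-in for the full HTML element list.

```python
import re

# Illustration only: each distinct unknown element name triggers its own
# validation rule, so malformed tags with unique names create unbounded rules.
html = """
<adata-FFE4='FFED'>
<adata-D5FE='AAC4'>
<a data-0001='OK'>
"""

KNOWN = {"a", "div", "span"}  # tiny stand-in for the real HTML element list

# Extract element names from opening tags and keep the unrecognized ones.
names = set(re.findall(r"<([a-zA-Z][\w-]*)", html))
unknown = names - KNOWN
for name in sorted(unknown):
    print(f"Element {name} not allowed as child element")
```

With randomly generated attribute names like these, every affected tag can yield a new unknown element, so rule count grows with the number of malformed tags rather than staying fixed.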
This problem is rare, but if it happens, turning off HTML validation (Edit Scan->Standards) greatly reduces the amount of memory used.
Applies To: Enterprise Edition 2016.1 or later
Last Reviewed: October 31, 2017