arelle.WebCache¶
See COPYRIGHT.md for copyright information.
For SEC EDGAR data access see: https://www.sec.gov/os/accessing-edgar-data
e.g., User-Agent: Sample Company Name AdminContact@
Module Contents¶
Classes¶
Functions¶
Data¶
API¶
- arelle.WebCache._: arelle.typing.TypeGetText¶
None
- arelle.WebCache.addServerWebCache¶
None
- arelle.WebCache.DIRECTORY_INDEX_FILE¶
‘!~DirectoryIndex~!’
- arelle.WebCache.FILE_LOCK_TIMEOUT¶
30
- arelle.WebCache.INF¶
‘float(…)’
- arelle.WebCache.RETRIEVAL_RETRY_COUNT¶
5
- arelle.WebCache.HTTP_USER_AGENT¶
‘format(…)’
- arelle.WebCache._XBRL_ORG_URL_PREFIXES¶
‘frozenset(…)’
- arelle.WebCache.XBRL_ORG_CACHE_REDIRECTS¶
None
- class arelle.WebCache.ProxyTuple¶
Bases: typing.NamedTuple
- useOsProxy: bool¶
None
- urlAddr: str | None¶
None
- urlPort: str | None¶
None
- user: str | None¶
None
- password: str | None¶
None
- classmethod coerce(value: Any) arelle.WebCache.ProxyTuple | None¶
- property authority: str¶
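As an illustration of the field layout listed above, the following is a minimal stand-in that does not import arelle. ProxyTupleSketch and authority_sketch are hypothetical names, and the "user:password@host:port" composition is the generic URL authority form, assumed here rather than taken from the arelle source.

```python
from __future__ import annotations

from typing import NamedTuple


class ProxyTupleSketch(NamedTuple):
    """Hypothetical stand-in mirroring the documented ProxyTuple fields."""

    useOsProxy: bool
    urlAddr: str | None
    urlPort: str | None
    user: str | None
    password: str | None


def authority_sketch(proxy: ProxyTupleSketch) -> str:
    """Compose a generic URL authority string (assumed format)."""
    host = proxy.urlAddr or ""
    if proxy.urlPort:
        host = f"{host}:{proxy.urlPort}"
    if proxy.user:
        credentials = proxy.user
        if proxy.password:
            credentials = f"{credentials}:{proxy.password}"
        return f"{credentials}@{host}"
    return host
```

The real authority property may format its result differently; consult the arelle source before relying on a particular shape.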
- arelle.WebCache.proxyDirFmt(httpProxyTuple: arelle.WebCache.ProxyTuple | None) dict[str, str] | None¶
- arelle.WebCache.proxyTuple(url: str) arelle.WebCache.ProxyTuple¶
- arelle.WebCache.lastModifiedTime(headers: dict[str, str]) float | None¶
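The signature above suggests a helper that turns an HTTP Last-Modified header into an epoch timestamp. The following standard-library sketch shows one way such a conversion could be written. The name last_modified_time_sketch, the header-key lookup, and the error handling are assumptions, not the arelle implementation.

```python
from __future__ import annotations

from email.utils import parsedate_to_datetime


def last_modified_time_sketch(headers: dict[str, str]) -> float | None:
    """Return the Last-Modified header as seconds since the epoch, or None."""
    # Header names are case-insensitive in HTTP; check two common spellings.
    value = headers.get("last-modified") or headers.get("Last-Modified")
    if not value:
        return None
    try:
        return parsedate_to_datetime(value).timestamp()
    except (TypeError, ValueError):
        # Malformed date header: report "unknown" rather than raising.
        return None
```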
- class arelle.WebCache.WebCache(cntlr: arelle.Cntlr.Cntlr, httpProxyTuple: arelle.WebCache.ProxyTuple | None)¶
Initialization
- default_timeout: float | int | None¶
None
- property timeout: float | None¶
- property recheck: str¶
- property logDownloads: bool¶
- saveUrlCheckTimes() None¶
- property noCertificateCheck: bool¶
- property httpUserAgent: str¶
- property httpsRedirect: bool¶
- redirectFallback(matchPattern: regex.Pattern[str], replaceFormat: str) None¶
- resetProxies(httpProxyTuple: arelle.WebCache.ProxyTuple | None) None¶
- property opener: urllib.request.OpenerDirector¶
- normalizeFilepath(filepath: str, url: str, cacheDir: str | None = None) str¶
Perform any necessary transformations to filepath.
- Parameters:
filepath – Filepath to normalize.
url – Original URL (for http/https redirect).
cacheDir – Cache root directory.
- Returns:
Normalized filepath.
- normalizeUrl(url: str | None, base: str | None = None) Any¶
- encodeForFilename(pathpart: str) str¶
- _fallbackRedirect(url: str, originalFilepath: str, cacheDir: str) str¶
If the original URL does not map to an existing cache file, we’ll check each fallback redirect pattern to see if modifying the URL yields a path to a file that does exist in the cache. If none is found, the original filepath is returned.
- Parameters:
url – The requested URL.
originalFilepath – The original mapped filepath.
cacheDir – Cache root directory.
- Returns:
An existing redirected path or the original filepath.
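The lookup described for _fallbackRedirect can be sketched without arelle as follows. Everything here is illustrative: fallback_redirect_sketch, the redirects list, and the url_to_filepath and exists callables are hypothetical, and applying each pattern with re.sub is an assumption about how replaceFormat is used.

```python
from __future__ import annotations

import os
import re
from collections.abc import Callable


def fallback_redirect_sketch(
    url: str,
    original_filepath: str,
    redirects: list[tuple[re.Pattern[str], str]],
    url_to_filepath: Callable[[str], str],
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Return an existing redirected cache path, else the original path."""
    if exists(original_filepath):
        return original_filepath
    for pattern, replacement in redirects:
        if not pattern.search(url):
            continue  # this redirect does not apply to the URL
        # Assumed rewrite step: substitute the matched part of the URL.
        candidate = url_to_filepath(pattern.sub(replacement, url))
        if exists(candidate):
            return candidate
    # No redirect produced an existing file.
    return original_filepath
```

Passing exists as a parameter keeps the sketch testable without touching the filesystem.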
- urlToCacheFilepath(url: str, cacheDir: str | None = None, useRedirectFallback: bool = True) str¶
Converts url into the corresponding cache filepath in cacheDir.
- Parameters:
url – URL to convert.
cacheDir – Cache root directory.
useRedirectFallback – Whether to use fallback redirects.
- Returns:
Cache filepath.
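To make the URL-to-path idea concrete, here is a minimal sketch that maps scheme, host, and path segments to nested directories. The function name, the directory layout, and the percent-encoding policy are assumptions. The real method may also handle query strings, redirects, and platform-specific characters, which this sketch ignores.

```python
import os
from urllib.parse import quote, urlsplit


def url_to_cache_filepath_sketch(url: str, cache_dir: str) -> str:
    """Map a URL to a path under cache_dir (assumed layout)."""
    parts = urlsplit(url)
    # One directory for the scheme, one for the host, one per path segment.
    segments = [parts.scheme, parts.netloc]
    segments += [s for s in parts.path.split("/") if s]
    # Percent-encode characters that are awkward in file names.
    safe_segments = [quote(s, safe="-_.~") for s in segments]
    return os.path.join(cache_dir, *safe_segments)
```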
- cacheFilepathToUrl(cacheFilepath: str, cacheDir: str | None = None) str¶
- getfilename(url: str | None, base: str | None = None, reload: bool = False, checkModifiedTime: bool = False, normalize: bool = False, filenameOnly: bool = False, allowTransformation: bool = True) str | None¶
- _checkIfNewerOnWeb(url: str, filepath: str) bool¶
- Parameters:
url – URL to retrieve web timestamp from
filepath – Filepath to retrieve local timestamp from
- Returns:
- static _getTimeString(timeValue: float) str¶
- Parameters:
timeValue – time in seconds since the epoch, in UTC
- Returns:
UTC-formatted string representation of timeValue.
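A conversion of this kind can be sketched with the standard library. The output format below is an assumption, since the page does not state which format _getTimeString produces, and get_time_string_sketch is a hypothetical name.

```python
import time


def get_time_string_sketch(time_value: float) -> str:
    """Format seconds since the epoch as a UTC string (assumed format)."""
    # time.gmtime interprets the value as UTC regardless of the local zone.
    return time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime(time_value))
```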
- static _quotedUrl(url: str) str¶
- Parameters:
url
- Returns:
url with scheme-specific-part quoted except for parameter separators
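One possible reading of this behavior is sketched below with urllib.parse.quote. Which characters count as parameter separators is not stated here, so the safe set is an assumption, as is the name quoted_url_sketch.

```python
from urllib.parse import quote


def quoted_url_sketch(url: str) -> str:
    """Quote everything after the scheme, leaving separators literal."""
    scheme, sep, rest = url.partition(":")
    if not sep:
        # No scheme present: quote the whole string as a path.
        return quote(url, safe="/%")
    # Assumed separator set; "%" stays literal so existing escapes survive.
    return scheme + sep + quote(rest, safe="/?=&#%;:@")
```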
- static _getFileTimestamp(path: str) float¶
- _downloadFileWithLock(url: str, filepath: str, retrievingDueToRecheckInterval: bool = False, retryCount: int = 5) bool¶
- _downloadFile(url: str, filepath: str, retrievingDueToRecheckInterval: bool = False, retryCount: int = 5) bool¶
Downloads the file at url to a temporary location before copying it to filepath.
- Parameters:
url – Web resource to download.
filepath – End destination for downloaded file.
retrievingDueToRecheckInterval – Determines how errors are handled when download is part of a cache recheck.
retryCount – Number of times to retry download.
- Returns:
Whether filepath should now be used.
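The download-to-temporary-then-copy pattern described for _downloadFile can be sketched as follows. This is not the arelle implementation: download_file_sketch and its fetch callable are hypothetical, the recheck-interval handling is omitted, and retry_count is treated as the total number of attempts, which is an assumption.

```python
from __future__ import annotations

import os
import shutil
import tempfile
from collections.abc import Callable


def download_file_sketch(
    url: str,
    filepath: str,
    fetch: Callable[[str], bytes],
    retry_count: int = 5,
) -> bool:
    """Fetch url, stage it in a temp file, then copy it to filepath."""
    for _attempt in range(retry_count):
        try:
            payload = fetch(url)
        except OSError:
            continue  # transient failure: try the next attempt
        fd, tmp_path = tempfile.mkstemp()
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(payload)
            # Copy only after the download completed successfully.
            os.makedirs(os.path.dirname(filepath) or ".", exist_ok=True)
            shutil.copyfile(tmp_path, filepath)
        finally:
            os.remove(tmp_path)
        return True
    # Every attempt failed; filepath was never written.
    return False
```

Staging the payload first means a failed or interrupted download never leaves a partial file at the destination path.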
- internetRecheckFailedRecovery(url: str, err: str | Exception, timeNowStr: str) None¶
- reportProgress(blockCount: int, blockSize: int, totalSize: int) None¶
- clear() None¶
- getheaders(url: str) dict[str, str]¶
- geturl(url: str) str | None¶
- retrieve(url: str, filename: str | None = None, filestream: io.BytesIO | None = None, reporthook: collections.abc.Callable[[int, int, int], None] | None = None, data: bytes | None = None) tuple[str | None, dict[str, str], bytes]¶