QueryHtml Class
Queries HTML content of the input documents and creates new documents with content and metadata from the results.
Namespace
Statiq.Core
Base Types
ParallelModule, Module, object (nearest base first)
Interfaces
IModule, IParallelModule

Syntax

public class QueryHtml : ParallelModule, IModule, IParallelModule

Remarks

Given a DOM query selector, the module creates one new output document for each query result. For each new document you can set the content, the metadata, or both from that query result.

Note that because this module parses the document content as standards-compliant HTML and outputs the formatted post-parsed DOM, you should only place this module after all other template processing has been performed.
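
As a rough illustration of the behavior described above, the following sketch configures the module with the constructor and fluent methods listed on this page. The selector `a` and the metadata keys `LinkHref` and `LinkText` are illustrative choices, not names defined by the library, and the snippet assumes a project that already references Statiq.

```csharp
using Statiq.Core;

// Sketch only: "LinkHref" and "LinkText" are illustrative metadata keys.
// Selects every <a> element; each match becomes its own output document.
var extractLinks = new QueryHtml("a")
    // Stores the href attribute under "LinkHref"; nothing is set when a match has no href.
    .GetAttributeValue("href", "LinkHref")
    // Stores the element's text content under "LinkText".
    .GetTextContent("LinkText");
```

Because SetContent is not called here, the result documents carry only metadata. In line with the note above, a module configured this way would normally come after any templating modules in a pipeline.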

Constructors

| Name | Summary |
| --- | --- |
| `QueryHtml(string)` | Creates the module with the specified query selector. |

Properties

| Name | Property Type | Summary |
| --- | --- | --- |
| `Parallel` | `bool` | Indicates whether documents will be processed by this module in parallel. Inherited from ParallelModule. |

Methods

| Name | Return Value | Summary |
| --- | --- | --- |
| `AfterExecution(IExecutionContext, ExecutionOutputs)` | `void` | Called after each module execution. Inherited from Module. |
| `AfterExecutionAsync(IExecutionContext, ExecutionOutputs)` | `Task` | Called after each module execution. Inherited from Module. |
| `BeforeExecution(IExecutionContext)` | `void` | Called before each module execution. Inherited from Module. |
| `BeforeExecutionAsync(IExecutionContext)` | `Task` | Called before each module execution. Inherited from Module. |
| `ExecuteAsync(IExecutionContext)` | `Task<IEnumerable<IDocument>>` | Do not call this directly; call IExecutionContext.Execute() instead if you need to execute a module from within another module. Inherited from Module. |
| `ExecuteContextAsync(IExecutionContext)` | `Task<IEnumerable<IDocument>>` | Executes the module once for all input documents. Inherited from ParallelModule. |
| `ExecuteInputAsync(IDocument, IExecutionContext)` | `Task<IEnumerable<IDocument>>` | |
| `Finally(IExecutionContext)` | `void` | Called after each module execution, even if an exception is thrown during execution. Inherited from Module. |
| `FinallyAsync(IExecutionContext)` | `Task` | Called after each module execution, even if an exception is thrown during execution. Inherited from Module. |
| `First(bool)` | `QueryHtml` | Specifies that only the first query result should be processed (the default is false). |
| `GetAll()` | `QueryHtml` | Gets all information for each query result and sets the metadata of the corresponding result document(s). This is equivalent to calling GetOuterHtml(), GetInnerHtml(), GetTextContent(), and GetAttributeValues() with default arguments. |
| `GetAttributeValue(string, string)` | `QueryHtml` | Gets the specified attribute value of each query result and sets it in the metadata of the corresponding result document(s). If the attribute is not found for a given query result, no metadata is set. If metadataKey is null, the attribute name is used as the metadata key; otherwise the specified metadata key is used. |
| `GetAttributeValues()` | `QueryHtml` | Gets the values of all attributes of each query result and sets them in the metadata of the corresponding result document(s), with key names equal to the attribute local names. |
| `GetInnerHtml(string)` | `QueryHtml` | Gets the inner HTML of each query result and sets it in the metadata of the corresponding result document(s) with the specified key. |
| `GetOuterHtml(string)` | `QueryHtml` | Gets the outer HTML of each query result and sets it in the metadata of the corresponding result document(s) with the specified key. |
| `GetTextContent(string)` | `QueryHtml` | Gets the text content of each query result and sets it in the metadata of the corresponding result document(s) with the specified key. |
| `SetContent(bool?)` | `QueryHtml` | Sets the content of the result document(s) to the content of the corresponding query result, optionally specifying whether inner or outer HTML content should be used. The default is null, which does not add any content to the result documents (only metadata). |

Extension Methods