Class AbstractProductProvider
java.lang.Object
io.github.dinethdilhara.urltoproduct.provider.AbstractProductProvider
- All Implemented Interfaces: ProductProvider
- Direct Known Subclasses: AliExpressProvider, AmazonProvider, GenericProvider
Base implementation for product providers.
This class contains shared scraping logic used by all concrete providers (e.g. Amazon, AliExpress, Generic).
It provides common utilities such as:
- HTML fetching using Jsoup
- Selector-based extraction helpers
- Price parsing and normalization
- Image URL cleaning
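Concrete providers must implement the site-specific logic declared by the abstract methods in the Method Summary, such as matchesHost, extractTitle, extractDescription, and extractPrice.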
- Version: 1.0.0
- Author: Dineth Dilhara
Constructor Summary
Constructors
AbstractProductProvider()
Method Summary
Modifier and Type | Method | Description
protected org.jsoup.nodes.Document | connectTo | Creates a Jsoup connection for the given URL.
 | extract | Extracts product information from the given URL.
protected String | extractBySelectors(org.jsoup.nodes.Document doc, String[] selectors) |
protected abstract String | extractDescription(org.jsoup.nodes.Document doc) |
 | extractImages(org.jsoup.nodes.Document doc) |
 | extractImagesBySelectors(org.jsoup.nodes.Document doc, String[] selectors) |
protected abstract BigDecimal | extractPrice(org.jsoup.nodes.Document doc) |
protected BigDecimal | extractPriceBySelectors(org.jsoup.nodes.Document doc, String[] selectors) |
protected abstract String | extractTitle(org.jsoup.nodes.Document doc) |
protected String | firstNonBlank(String... values) |
protected abstract boolean | matchesHost(String host) |
protected String | normalizeImageUrl(String url) |
protected String | normalizeWhitespace(String value) |
protected BigDecimal | parsePrice(String rawPrice) | Converts raw price text into a normalized BigDecimal value.
protected abstract String | providerName |
boolean | supports | Checks whether this provider can handle the given URL.
Constructor Details
AbstractProductProvider
public AbstractProductProvider()
Method Details
supports
Description copied from interface: ProductProvider
Checks whether this provider can handle the given URL.
- Specified by: supports in interface ProductProvider
- Parameters: url - product page URL
- Returns: true if the provider supports the URL, false otherwise
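One plausible way for supports to work is to parse the host out of the URL and hand it to the abstract matchesHost hook. The sketch below illustrates that idea only; it is not the library's actual code. HostMatchingSketch, ExampleShopSketch, the String parameter type of supports, and the example.com domain are all assumptions made for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical stand-in for a provider base class; not the library's code.
abstract class HostMatchingSketch {

    // Site-specific check, mirroring the abstract matchesHost(String host) hook.
    protected abstract boolean matchesHost(String host);

    // Assumed behavior: parse the URL, pull out the host, and delegate.
    public boolean supports(String url) {
        if (url == null || url.isBlank()) {
            return false;
        }
        try {
            String host = new URI(url.trim()).getHost();
            return host != null && matchesHost(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // malformed URLs are treated as unsupported
        }
    }
}

// Example subclass for a made-up shop domain.
class ExampleShopSketch extends HostMatchingSketch {
    @Override
    protected boolean matchesHost(String host) {
        return host.equals("example.com") || host.endsWith(".example.com");
    }
}
```

With this shape, a caller can probe each registered provider with supports and pick the first one that returns true, leaving a catch-all provider for last.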
extract
Extracts product information from the given URL.
Fetches HTML, applies provider-specific parsing logic, and returns normalized product data.
- Specified by: extract in interface ProductProvider
- Parameters: url - product page URL
- Returns: extracted ProductDetails
- Throws: ProviderExtractionException - if scraping fails
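The fetch, parse, and assemble flow described for extract resembles the template method pattern. The following is a dependency-free sketch of that pattern, not the real implementation: a plain HTML String stands in for org.jsoup.nodes.Document, the fetch hook stands in for connectTo, and ProductSketch, TemplateProviderSketch, CannedProviderSketch, and the List<String> return type of extractImages are hypothetical.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical result type; the real ProductDetails class is not shown on this page.
final class ProductSketch {
    public final String provider;
    public final String title;
    public final String description;
    public final BigDecimal price;
    public final List<String> images;

    ProductSketch(String provider, String title, String description,
                  BigDecimal price, List<String> images) {
        this.provider = provider;
        this.title = title;
        this.description = description;
        this.price = price;
        this.images = images;
    }
}

// Hypothetical stand-in for the abstract base class. A String of HTML
// replaces the Jsoup Document so the sketch needs no dependencies.
abstract class TemplateProviderSketch {

    // Hooks a concrete provider fills in (names follow the Method Summary).
    protected abstract String providerName();
    protected abstract String fetch(String url) throws Exception; // stands in for connectTo
    protected abstract String extractTitle(String html);
    protected abstract String extractDescription(String html);
    protected abstract BigDecimal extractPrice(String html);
    protected abstract List<String> extractImages(String html);

    // Template method: fixed order of steps, site-specific details delegated.
    public final ProductSketch extract(String url) {
        try {
            String html = fetch(url);
            return new ProductSketch(providerName(), extractTitle(html),
                    extractDescription(html), extractPrice(html), extractImages(html));
        } catch (Exception e) {
            // The documented method throws ProviderExtractionException; an
            // unchecked exception keeps this sketch self-contained.
            throw new IllegalStateException("Extraction failed for " + url, e);
        }
    }
}

// Toy provider that "fetches" a canned page and reads fixed values from it.
class CannedProviderSketch extends TemplateProviderSketch {
    @Override protected String providerName() { return "canned"; }
    @Override protected String fetch(String url) { return "<title>Demo item</title>"; }
    @Override protected String extractTitle(String html) {
        return html.replaceAll("(?s).*<title>(.*?)</title>.*", "$1");
    }
    @Override protected String extractDescription(String html) { return ""; }
    @Override protected BigDecimal extractPrice(String html) { return new BigDecimal("9.99"); }
    @Override protected List<String> extractImages(String html) { return List.of(); }
}
```

Keeping extract final in the base class means every provider fetches, parses, and wraps errors the same way, while subclasses only decide where each field lives on the page.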
matchesHost
providerName
extractTitle
extractDescription
extractPrice
extractImages
firstNonBlank
normalizeWhitespace
normalizeImageUrl
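This page gives no descriptions for the small text helpers, so the sketch below shows only what the names firstNonBlank and normalizeWhitespace suggest. It is a guess at reasonable behavior, not the library's code. TextHelpersSketch is a hypothetical class, and the null handling and non-breaking-space replacement are assumptions.

```java
// Hypothetical helpers; behavior is inferred from the method names only.
final class TextHelpersSketch {

    private TextHelpersSketch() {}

    // Collapses runs of whitespace (including non-breaking spaces) into
    // single spaces and trims both ends. Null stays null.
    static String normalizeWhitespace(String value) {
        if (value == null) {
            return null;
        }
        return value.replace('\u00A0', ' ').replaceAll("\\s+", " ").trim();
    }

    // Returns the first argument that is neither null nor blank,
    // or null when no such argument exists.
    static String firstNonBlank(String... values) {
        if (values == null) {
            return null;
        }
        for (String v : values) {
            if (v != null && !v.isBlank()) {
                return v;
            }
        }
        return null;
    }
}
```

Helpers like these are typically combined when a page offers several candidate sources for one field, for example a meta tag and a visible heading, and the first usable value should win.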
parsePrice
Converts raw price text into a normalized BigDecimal value.
- Parameters: rawPrice - raw price string from HTML
- Returns: parsed price, or null if invalid
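As an illustration of the contract above (raw text in, BigDecimal or null out), here is a minimal sketch of one possible parsing strategy. It is not the library's algorithm. PriceParsingSketch is a hypothetical class, and the rule that a final separator followed by one or two digits is the decimal point is an assumption.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical price parser; the separator heuristic is an assumption.
final class PriceParsingSketch {

    // First run of digits, optionally containing '.' or ',' separators.
    private static final Pattern NUMBER = Pattern.compile("\\d[\\d.,]*");

    private PriceParsingSketch() {}

    // Returns the parsed price, or null when no usable number is found.
    static BigDecimal parsePrice(String rawPrice) {
        if (rawPrice == null) {
            return null;
        }
        Matcher m = NUMBER.matcher(rawPrice);
        if (!m.find()) {
            return null;
        }
        // Drop separators left dangling at the end, e.g. "12." or "12,".
        String number = m.group().replaceAll("[.,]+$", "");

        // Treat the last separator as the decimal point only when it is
        // followed by one or two digits; otherwise every separator is grouping.
        int sep = Math.max(number.lastIndexOf('.'), number.lastIndexOf(','));
        int digitsAfter = number.length() - sep - 1;
        String intPart = number;
        String fracPart = "";
        if (sep >= 0 && digitsAfter >= 1 && digitsAfter <= 2) {
            intPart = number.substring(0, sep);
            fracPart = number.substring(sep + 1);
        }
        intPart = intPart.replaceAll("[.,]", "");

        String normalized = fracPart.isEmpty() ? intPart : intPart + "." + fracPart;
        try {
            return new BigDecimal(normalized);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Under this heuristic both "$1,299.99" and "1.299,99 €" parse to 1299.99, while an input such as "12.345" is read as the integer 12345, which may be wrong for currencies that use three decimal places.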
extractBySelectors
extractPriceBySelectors
extractImagesBySelectors
connectTo