Class AbstractProductProvider

java.lang.Object
io.github.dinethdilhara.urltoproduct.provider.AbstractProductProvider
All Implemented Interfaces:
ProductProvider
Direct Known Subclasses:
AliExpressProvider, AmazonProvider, GenericProvider

public abstract class AbstractProductProvider extends Object implements ProductProvider
Base implementation for product providers.

This class contains shared scraping logic used by all concrete providers (e.g. Amazon, AliExpress, Generic).

It provides common utilities such as:

  • HTML fetching using Jsoup
  • Selector-based extraction helpers
  • Price parsing and normalization
  • Image URL cleaning

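Concrete providers must implement site-specific logic such as:

  • Host matching and provider naming (matchesHost, providerName)
  • Title and description extraction (extractTitle, extractDescription)
  • Price extraction (extractPrice)
  • Image extraction (extractImages)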

Version:
1.0.0
Author:
Dineth Dilhara
  • Constructor Details

    • AbstractProductProvider

      public AbstractProductProvider()
  • Method Details

    • supports

      public boolean supports(String url)
      Description copied from interface: ProductProvider
      Checks whether this provider can handle the given URL.
      Specified by:
      supports in interface ProductProvider
      Parameters:
      url - product page URL
      Returns:
      true if provider supports the URL, false otherwise
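
      The page does not say how supports decides. One plausible arrangement,
      sketched here as an assumption, is that it parses the host from the URL
      and delegates to matchesHost. HostMatchingSketch, ExampleShopSketch and
      the example.com host are hypothetical names used only for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical base class: assumes supports() delegates to matchesHost().
abstract class HostMatchingSketch {

    // Site-specific check, mirroring the abstract matchesHost(String) hook.
    protected abstract boolean matchesHost(String host);

    public boolean supports(String url) {
        if (url == null || url.isBlank()) {
            return false;
        }
        try {
            // URI#getHost returns null when the string has no authority part.
            String host = new URI(url.trim()).getHost();
            if (host == null) {
                return false;
            }
            return matchesHost(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Malformed input is treated as "not supported" rather than an error.
            return false;
        }
    }
}

// Hypothetical concrete provider for an illustrative host.
final class ExampleShopSketch extends HostMatchingSketch {
    @Override
    protected boolean matchesHost(String host) {
        // Accept the bare domain and any subdomain, but not look-alike hosts.
        return host.equals("example.com") || host.endsWith(".example.com");
    }
}
```

      Matching on the dot-prefixed suffix keeps a host such as
      notexample.com from being accepted by accident.
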
    • extract

      public ProductDetails extract(String url)
      Extracts product information from the given URL.

      Fetches HTML, applies provider-specific parsing logic, and returns normalized product data.

      Specified by:
      extract in interface ProductProvider
      Parameters:
      url - product page URL
      Returns:
      extracted ProductDetails
      Throws:
      ProviderExtractionException - if scraping fails
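
      The fetch, parse, normalize flow described above resembles a template
      method. The sketch below illustrates that shape under stated
      assumptions: it uses a plain String in place of the Jsoup Document,
      and DetailsSketch, ExtractionFailedSketch, TemplateProviderSketch and
      FixedPageProviderSketch are hypothetical stand-ins, not the library's
      types. The unchecked exception is also an assumption, based only on
      extract having no throws clause in its signature.

```java
import java.io.IOException;
import java.math.BigDecimal;

// Hypothetical stand-in for ProductDetails (its real fields are not shown here).
final class DetailsSketch {
    final String provider;
    final String title;
    final BigDecimal price;

    DetailsSketch(String provider, String title, BigDecimal price) {
        this.provider = provider;
        this.title = title;
        this.price = price;
    }
}

// Hypothetical stand-in for ProviderExtractionException.
class ExtractionFailedSketch extends RuntimeException {
    ExtractionFailedSketch(String message, Throwable cause) {
        super(message, cause);
    }
}

abstract class TemplateProviderSketch {
    protected abstract String providerName();
    protected abstract String fetch(String url) throws Exception; // stand-in for connectTo
    protected abstract String extractTitle(String page);
    protected abstract BigDecimal extractPrice(String page);

    // Fixed pipeline: fetch once, run the site-specific hooks, wrap any failure.
    public final DetailsSketch extract(String url) {
        try {
            String page = fetch(url);
            return new DetailsSketch(providerName(), extractTitle(page), extractPrice(page));
        } catch (Exception e) {
            throw new ExtractionFailedSketch(providerName() + " failed for " + url, e);
        }
    }
}

// Toy provider that "fetches" a fixed "title|price" string instead of real HTML.
final class FixedPageProviderSketch extends TemplateProviderSketch {
    private final String page;

    FixedPageProviderSketch(String page) {
        this.page = page;
    }

    @Override protected String providerName() { return "fixed"; }

    @Override protected String fetch(String url) throws Exception {
        if (page == null) {
            throw new IOException("no page for " + url);
        }
        return page;
    }

    @Override protected String extractTitle(String p) {
        return p.substring(0, p.indexOf('|')).trim();
    }

    @Override protected BigDecimal extractPrice(String p) {
        return new BigDecimal(p.substring(p.indexOf('|') + 1).trim());
    }
}
```

      Keeping the pipeline in one final method means each provider only
      supplies the site-specific hooks, and every failure surfaces as a
      single exception type.
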
    • matchesHost

      protected abstract boolean matchesHost(String host)
    • providerName

      protected abstract String providerName()
    • extractTitle

      protected abstract String extractTitle(org.jsoup.nodes.Document doc)
    • extractDescription

      protected abstract String extractDescription(org.jsoup.nodes.Document doc)
    • extractPrice

      protected abstract BigDecimal extractPrice(org.jsoup.nodes.Document doc)
    • extractImages

      protected abstract ArrayList<String> extractImages(org.jsoup.nodes.Document doc)
    • firstNonBlank

      protected String firstNonBlank(String... values)
    • normalizeWhitespace

      protected String normalizeWhitespace(String value)
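
      firstNonBlank and normalizeWhitespace are undocumented on this page.
      The sketch below shows one reading of their names, as an assumption
      rather than the actual implementation: collapse runs of whitespace
      into single spaces, and return the first argument that is neither
      null nor blank. TextHelpersSketch is a hypothetical holder class.

```java
// Hypothetical helpers; behaviour is inferred from the method names only.
final class TextHelpersSketch {

    private TextHelpersSketch() {
    }

    // Collapses each run of whitespace (spaces, tabs, newlines) into one space
    // and trims the ends. Returns null for null input.
    static String normalizeWhitespace(String value) {
        if (value == null) {
            return null;
        }
        return value.replaceAll("\\s+", " ").trim();
    }

    // Returns the first argument that is neither null nor blank, or null if none is.
    static String firstNonBlank(String... values) {
        if (values == null) {
            return null;
        }
        for (String v : values) {
            if (v != null && !v.isBlank()) {
                return v;
            }
        }
        return null;
    }
}
```

      A helper like firstNonBlank suits scraping because a page may expose
      the same field in several places, for example a meta tag and a
      heading, and the first usable one can win.
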
    • normalizeImageUrl

      protected String normalizeImageUrl(String url)
    • parsePrice

      protected BigDecimal parsePrice(String rawPrice)
      Converts raw price text into a normalized BigDecimal value.
      Parameters:
      rawPrice - raw price string from HTML
      Returns:
      parsed price or null if invalid
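
      The exact normalization rules are not documented here. As a minimal
      sketch, assuming US-style formatting (comma as thousands separator,
      dot as decimal separator), the conversion could look like the
      following. PriceParsingSketch is a hypothetical class name, and
      European formats such as 1.299,99 are deliberately not handled.

```java
import java.math.BigDecimal;

// Hypothetical illustration of raw-price parsing; not the library's code.
final class PriceParsingSketch {

    private PriceParsingSketch() {
    }

    // Returns the parsed price, or null when no valid number can be recovered.
    static BigDecimal parsePrice(String rawPrice) {
        if (rawPrice == null) {
            return null;
        }
        // Drop currency symbols, letters, spaces and thousands commas;
        // only digits and dots survive.
        String cleaned = rawPrice.replaceAll("[^0-9.]", "");
        if (cleaned.isEmpty()) {
            return null;
        }
        try {
            return new BigDecimal(cleaned);
        } catch (NumberFormatException e) {
            // For example "1.2.3" or a lone "." after cleaning.
            return null;
        }
    }
}
```

      BigDecimal avoids the rounding surprises of double for currency
      values, which is presumably why the method returns it. Returning null
      instead of throwing lets callers fall through to the next candidate
      string.
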
    • extractBySelectors

      protected String extractBySelectors(org.jsoup.nodes.Document doc, String[] selectors)
    • extractPriceBySelectors

      protected BigDecimal extractPriceBySelectors(org.jsoup.nodes.Document doc, String[] selectors)
    • extractImagesBySelectors

      protected ArrayList<String> extractImagesBySelectors(org.jsoup.nodes.Document doc, String[] selectors)
    • connectTo

      protected org.jsoup.nodes.Document connectTo(String url) throws Exception
      Connects to the given URL with Jsoup and returns the fetched page as a parsed document.
      Parameters:
      url - target URL
      Returns:
      parsed HTML document
      Throws:
      Exception - if connection fails