Class Axiomatic

  • Direct Known Subclasses:
    AxiomaticF1EXP, AxiomaticF1LOG, AxiomaticF2EXP, AxiomaticF2LOG, AxiomaticF3EXP, AxiomaticF3LOG

    public abstract class Axiomatic
    extends SimilarityBase
    Axiomatic approaches for IR. From Hui Fang and Chengxiang Zhai 2005. An Exploration of Axiomatic Approaches to Information Retrieval. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '05). ACM, New York, NY, USA, 480-487.

    This is a family of models, all based on BM25, Pivoted Document Length Normalization, and the Language Model with Dirichlet prior. Some components of the original models (e.g. Term Frequency, Inverse Document Frequency) are modified so that they satisfy a set of axiomatic constraints.
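As a rough illustration of how one member of the family combines its components, the sketch below computes an F2EXP-style score for a single query term in plain Java. It is not the Lucene implementation: the class name `F2ExpSketch`, the parameter names, and the exact form of the formula (in particular the `(N + 1) / df` ratio) are assumptions based on a reading of the cited paper.

```java
/** Standalone sketch of an F2EXP-style per-term score; not the Lucene implementation. */
final class F2ExpSketch {
  private F2ExpSketch() {}

  /**
   * @param freq      occurrences of the term in the document
   * @param docLen    length of the document
   * @param avgDocLen average document length in the collection
   * @param numDocs   number of documents in the collection
   * @param docFreq   number of documents containing the term (must be positive)
   * @param s         hyperparameter for the growth function
   * @param k         hyperparameter for the primitive weighting function
   */
  static double score(double freq, double docLen, double avgDocLen,
                      long numDocs, long docFreq, float s, float k) {
    // Mixed term frequency and document length component (assumed form):
    // grows with freq but saturates below 1, and shrinks for longer documents.
    double tfln = freq / (freq + s + s * docLen / avgDocLen);
    // Inverse document frequency component (assumed form): rarer terms weigh more.
    double idf = Math.pow((numDocs + 1.0) / docFreq, k);
    return tfln * idf;
  }
}
```

In this sketch, `s` controls how strongly long documents are penalized and `k` controls how sharply rare terms are favored.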

    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected float k
      hyperparam for the primitive weighting function
      protected int queryLen
      the query length
      protected float s
      hyperparam for the growth function
    • Constructor Summary

      Constructors 
      Constructor Description
      Axiomatic()
      Default constructor
      Axiomatic​(float s)
      Constructor setting only s, leaving k and queryLen at their defaults
      Axiomatic​(float s, int queryLen)
      Constructor setting s and queryLen, leaving k at its default
      Axiomatic​(float s, int queryLen, float k)
      Constructor setting all Axiomatic hyperparameters
    • Method Summary

      Modifier and Type Method Description
      protected void explain​(java.util.List<Explanation> subs, BasicStats stats, double freq, double docLen)
      Subclasses should implement this method to explain the score.
      protected Explanation explain​(BasicStats stats, Explanation freq, double docLen)
      Explains the score.
      protected abstract double gamma​(BasicStats stats, double freq, double docLen)
      compute the gamma component (only for F3EXP and F3LOG)
      protected abstract double idf​(BasicStats stats, double freq, double docLen)
      compute the inverse document frequency component
      protected abstract Explanation idfExplain​(BasicStats stats, double freq, double docLen)
      Explain the score of the inverse document frequency component for a single document
      protected abstract double ln​(BasicStats stats, double freq, double docLen)
      compute the document length component
      protected abstract Explanation lnExplain​(BasicStats stats, double freq, double docLen)
      Explain the score of the document length component for a single document
      double score​(BasicStats stats, double freq, double docLen)
      Scores the document doc.
      protected abstract double tf​(BasicStats stats, double freq, double docLen)
      compute the term frequency component
      protected abstract Explanation tfExplain​(BasicStats stats, double freq, double docLen)
      Explain the score of the term frequency component for a single document
      protected abstract double tfln​(BasicStats stats, double freq, double docLen)
      compute the mixed term frequency and document length component
      protected abstract Explanation tflnExplain​(BasicStats stats, double freq, double docLen)
      Explain the score of the mixed term frequency and document length component for a single document
      abstract java.lang.String toString()
      Name of the axiomatic method.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • s

        protected final float s
        hyperparam for the growth function
      • k

        protected final float k
        hyperparam for the primitive weighting function
      • queryLen

        protected final int queryLen
        the query length
    • Constructor Detail

      • Axiomatic

        public Axiomatic​(float s,
                         int queryLen,
                         float k)
        Constructor setting all Axiomatic hyperparameters
        Parameters:
        s - hyperparam for the growth function
        queryLen - the query length
        k - hyperparam for the primitive weighting function
      • Axiomatic

        public Axiomatic​(float s)
        Constructor setting only s, leaving k and queryLen at their defaults
        Parameters:
        s - hyperparam for the growth function
      • Axiomatic

        public Axiomatic​(float s,
                         int queryLen)
        Constructor setting s and queryLen, leaving k at its default
        Parameters:
        s - hyperparam for the growth function
        queryLen - the query length
      • Axiomatic

        public Axiomatic()
        Default constructor
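The four constructors follow a telescoping pattern in which the shorter forms delegate to the full one. A minimal sketch of that pattern is shown here with a hypothetical stand-in class, `AxiomaticParamsSketch`. The default values and the range checks are assumptions made for illustration; this page does not state them.

```java
/** Hypothetical stand-in illustrating telescoping constructors; not the Lucene class. */
final class AxiomaticParamsSketch {
  // Assumed defaults, for illustration only.
  static final float DEFAULT_S = 0.25f;
  static final int DEFAULT_QUERY_LEN = 1;
  static final float DEFAULT_K = 0.35f;

  final float s;
  final int queryLen;
  final float k;

  /** Default constructor: every hyperparameter takes its default. */
  AxiomaticParamsSketch() {
    this(DEFAULT_S, DEFAULT_QUERY_LEN, DEFAULT_K);
  }

  /** Sets only s; queryLen and k keep their defaults. */
  AxiomaticParamsSketch(float s) {
    this(s, DEFAULT_QUERY_LEN, DEFAULT_K);
  }

  /** Sets s and queryLen; k keeps its default. */
  AxiomaticParamsSketch(float s, int queryLen) {
    this(s, queryLen, DEFAULT_K);
  }

  /** Sets all hyperparameters; the range checks below are assumed, not documented. */
  AxiomaticParamsSketch(float s, int queryLen, float k) {
    // The negated form also rejects NaN.
    if (!(s >= 0f && s <= 1f)) {
      throw new IllegalArgumentException("illegal s value: " + s);
    }
    if (queryLen < 0) {
      throw new IllegalArgumentException("illegal queryLen value: " + queryLen);
    }
    if (!(k >= 0f && k <= 1f)) {
      throw new IllegalArgumentException("illegal k value: " + k);
    }
    this.s = s;
    this.queryLen = queryLen;
    this.k = k;
  }
}
```

Routing every constructor through the full form keeps validation in one place, so the shorter constructors cannot bypass it.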
    • Method Detail

      • score

        public double score​(BasicStats stats,
                            double freq,
                            double docLen)
        Description copied from class: SimilarityBase
        Scores the document doc.

        Subclasses must apply their scoring formula in this method.

        Specified by:
        score in class SimilarityBase
        Parameters:
        stats - the corpus level statistics.
        freq - the term frequency.
        docLen - the document length.
        Returns:
        the score.
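This page does not spell out how score combines the abstract components. The sketch below assumes one plausible combination: the product of the tf, ln, tfln and idf components, minus gamma, floored at zero. The class names are hypothetical and the `BasicStats` argument is omitted to keep the example self-contained.

```java
/** Sketch of a template-method scorer; the combination rule is an assumption. */
abstract class ComponentScorerSketch {
  abstract double tf(double freq, double docLen);
  abstract double ln(double freq, double docLen);
  abstract double tfln(double freq, double docLen);
  abstract double idf(double freq, double docLen);
  abstract double gamma(double freq, double docLen);

  /** Combines the components; negative results are clamped to zero. */
  final double score(double freq, double docLen) {
    double raw = tf(freq, docLen)
        * ln(freq, docLen)
        * tfln(freq, docLen)
        * idf(freq, docLen)
        - gamma(freq, docLen);
    return Math.max(0.0, raw);
  }
}

/** Toy subclass with fixed component values, used only to exercise score(). */
final class FixedComponents extends ComponentScorerSketch {
  private final double tfValue;
  private final double lnValue;
  private final double tflnValue;
  private final double idfValue;
  private final double gammaValue;

  FixedComponents(double tfValue, double lnValue, double tflnValue,
                  double idfValue, double gammaValue) {
    this.tfValue = tfValue;
    this.lnValue = lnValue;
    this.tflnValue = tflnValue;
    this.idfValue = idfValue;
    this.gammaValue = gammaValue;
  }

  @Override double tf(double freq, double docLen) { return tfValue; }
  @Override double ln(double freq, double docLen) { return lnValue; }
  @Override double tfln(double freq, double docLen) { return tflnValue; }
  @Override double idf(double freq, double docLen) { return idfValue; }
  @Override double gamma(double freq, double docLen) { return gammaValue; }
}
```

Under this assumption, a variant that does not use a given component can return 1 for a multiplicative component or 0 for gamma.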
      • explain

        protected void explain​(java.util.List<Explanation> subs,
                               BasicStats stats,
                               double freq,
                               double docLen)
        Description copied from class: SimilarityBase
        Subclasses should implement this method to explain the score. The explanation already contains the score, the name of the class and the doc id, as well as the term frequency and its explanation; subclasses can add additional clauses to explain details of their scoring formulae.

        The default implementation does nothing.

        Overrides:
        explain in class SimilarityBase
        Parameters:
        subs - the list of details of the explanation to extend
        stats - the corpus level statistics.
        freq - the term frequency.
        docLen - the document length.
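The idea of extending a list of explanation details can be sketched without the Lucene types. In the example below, `DetailSketch` is a hypothetical stand-in for an explanation entry (a value plus a description), and the entries appended are assumed for illustration rather than taken from the actual override.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an explanation entry: a value plus a description. */
final class DetailSketch {
  final double value;
  final String description;

  DetailSketch(double value, String description) {
    this.value = value;
    this.description = description;
  }
}

/** Sketch of an explain step that appends details to a caller-supplied list. */
final class ExplainSketch {
  private ExplainSketch() {}

  /** Appends one entry per hyperparameter; existing entries are left untouched. */
  static void explain(List<DetailSketch> subs, float s, int queryLen, float k) {
    subs.add(new DetailSketch(k, "k, hyperparam for the primitive weighting function"));
    subs.add(new DetailSketch(s, "s, hyperparam for the growth function"));
    subs.add(new DetailSketch(queryLen, "queryLen, the query length"));
  }

  /** Convenience helper that starts from an empty list. */
  static List<DetailSketch> explain(float s, int queryLen, float k) {
    List<DetailSketch> subs = new ArrayList<>();
    explain(subs, s, queryLen, k);
    return subs;
  }
}
```

Because the method only appends, whatever the caller has already placed in the list is preserved.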
      • toString

        public abstract java.lang.String toString()
        Name of the axiomatic method.
        Specified by:
        toString in class SimilarityBase
      • tf

        protected abstract double tf​(BasicStats stats,
                                     double freq,
                                     double docLen)
        compute the term frequency component
      • ln

        protected abstract double ln​(BasicStats stats,
                                     double freq,
                                     double docLen)
        compute the document length component
      • tfln

        protected abstract double tfln​(BasicStats stats,
                                       double freq,
                                       double docLen)
        compute the mixed term frequency and document length component
      • idf

        protected abstract double idf​(BasicStats stats,
                                      double freq,
                                      double docLen)
        compute the inverse document frequency component
      • gamma

        protected abstract double gamma​(BasicStats stats,
                                        double freq,
                                        double docLen)
        compute the gamma component (only for F3EXP and F3LOG)
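For the F3 variants, gamma acts as a document length penalty. The sketch below uses the form `(docLen - queryLen) * s * queryLen / avgDocLen`, which is an assumption based on a reading of the cited paper rather than something this page states. The class and method names are hypothetical.

```java
/** Sketch of a gamma penalty for F3-style models; the formula is an assumption. */
final class GammaSketch {
  private GammaSketch() {}

  /**
   * @param docLen    length of the document
   * @param avgDocLen average document length in the collection (must be positive)
   * @param queryLen  the query length
   * @param s         hyperparameter for the growth function
   */
  static double f3Gamma(double docLen, double avgDocLen, int queryLen, float s) {
    // The penalty grows linearly with how much longer the document is than the query.
    return (docLen - queryLen) * s * queryLen / avgDocLen;
  }

  /** Variants without a length penalty can simply contribute zero. */
  static double noGamma() {
    return 0.0;
  }
}
```

With this form, a document exactly as long as the query incurs no penalty, and the penalty scales with both `s` and the query length.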
      • tfExplain

        protected abstract Explanation tfExplain​(BasicStats stats,
                                                 double freq,
                                                 double docLen)
        Explain the score of the term frequency component for a single document
        Parameters:
        stats - the corpus level statistics
        freq - number of occurrences of term in the document
        docLen - the document length
        Returns:
        Explanation of how the tf component was computed
      • lnExplain

        protected abstract Explanation lnExplain​(BasicStats stats,
                                                 double freq,
                                                 double docLen)
        Explain the score of the document length component for a single document
        Parameters:
        stats - the corpus level statistics
        freq - number of occurrences of term in the document
        docLen - the document length
        Returns:
        Explanation of how the ln component was computed
      • tflnExplain

        protected abstract Explanation tflnExplain​(BasicStats stats,
                                                   double freq,
                                                   double docLen)
        Explain the score of the mixed term frequency and document length component for a single document
        Parameters:
        stats - the corpus level statistics
        freq - number of occurrences of term in the document
        docLen - the document length
        Returns:
        Explanation of how the tfln component was computed
      • idfExplain

        protected abstract Explanation idfExplain​(BasicStats stats,
                                                  double freq,
                                                  double docLen)
        Explain the score of the inverse document frequency component for a single document
        Parameters:
        stats - the corpus level statistics
        freq - number of occurrences of term in the document
        docLen - the document length
        Returns:
        Explanation of how the idf component was computed
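Each of the component explain methods returns a description of how one component was computed. A minimal sketch of that idea for an EXP-style idf follows, using a hypothetical tree type `NodeSketch` in place of Lucene's `Explanation`. The formula `((N + 1) / n) ^ k` and the wording of the descriptions are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical explanation node: a value, a description, and nested details. */
final class NodeSketch {
  final double value;
  final String description;
  final List<NodeSketch> details;

  NodeSketch(double value, String description, NodeSketch... details) {
    this.value = value;
    this.description = description;
    this.details = Collections.unmodifiableList(Arrays.asList(details));
  }
}

/** Sketch of explaining an EXP-style idf component from its inputs. */
final class IdfExplainSketch {
  private IdfExplainSketch() {}

  /**
   * @param numDocs number of documents in the collection
   * @param docFreq number of documents containing the term (must be positive)
   * @param k       hyperparameter for the primitive weighting function
   */
  static NodeSketch idfExplain(long numDocs, long docFreq, float k) {
    double idf = Math.pow((numDocs + 1.0) / docFreq, k);
    // The top-level node carries the component value; children carry its inputs.
    return new NodeSketch(idf, "idf, computed as ((N + 1) / n) ^ k from:",
        new NodeSketch(numDocs, "N, total number of documents"),
        new NodeSketch(docFreq, "n, number of documents containing the term"),
        new NodeSketch(k, "k, hyperparam for the primitive weighting function"));
  }
}
```

Keeping the inputs as child nodes lets a reader of the explanation recompute the component value by hand.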