Class TextPosition


  • public class TextPosition
    extends java.lang.Object
    This represents a string of characters and the position of those characters on the screen.
    Version:
    $Revision: 1.12 $
    Author:
    Ben Litchfield
    • Method Summary

      Modifier and Type Method Description
      boolean contains(TextPosition tp2)
      Determine if this TextPosition logically contains another (i.e. they overlap and should be rendered on top of each other).
      java.lang.String getCharacter()
      Return the string of characters stored in this object.
      int[] getCodePoints()
      Return the codepoints of the characters stored in this object.
      float getDir()
      Return the direction/orientation of the string in this object based on its text matrix.
      PDFont getFont()
      This will get the font for the text being drawn.
      float getFontSize()
      This will get the font size that this object is supposed to be drawn at.
      float getFontSizeInPt()
      This will get the font size in pt.
      float getHeight()
      This will get the maximum height of all characters in this string.
      float getHeightDir()
      This will get the maximum height of all characters in this string.
      float[] getIndividualWidths()
      Get the widths of each individual character.
      Matrix getTextPos()
      Return the Matrix textPos stored in this object.
      float getWidth()
      This will get the width of the string when page rotation adjusted coordinates are used.
      float getWidthDirAdj()
      This will get the width of the string when text direction adjusted coordinates are used.
      float getWidthOfSpace()
      This will get the width of a space character.
      float getWordSpacing()
      Deprecated.
      float getX()
      This will get the page rotation adjusted x position of the character.
      float getXDirAdj()
      This will get the text direction adjusted x position of the character.
      float getXScale()  
      float getY()
      This will get the y position of the text, adjusted so that 0,0 is upper left and it is adjusted based on the page rotation.
      float getYDirAdj()
      This will get the y position of the text, adjusted so that 0,0 is upper left and it is adjusted based on the text direction.
      float getYScale()  
      boolean isDiacritic()  
      void mergeDiacritic(TextPosition diacritic, TextNormalize normalize)
      Merge a single character TextPosition into the current object.
      java.lang.String toString()
      Show the string data for this text position.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • TextPosition

        protected TextPosition()
        Constructor.
      • TextPosition

        public TextPosition(PDPage page,
                            Matrix textPositionSt,
                            Matrix textPositionEnd,
                            float maxFontH,
                            float[] individualWidths,
                            float spaceWidth,
                            java.lang.String string,
                            PDFont currentFont,
                            float fontSizeValue,
                            int fontSizeInPt,
                            float ws)
        Constructor.
        Parameters:
        page - Page that the text is located in
        textPositionSt - TextMatrix for start of text (in display units)
        textPositionEnd - TextMatrix for end of text (in display units)
        maxFontH - Maximum height of text (in display units)
        individualWidths - The width of each individual character. (in ? units)
        spaceWidth - The width of the space character. (in display units)
        string - The character to be displayed.
        currentFont - The current font for this text position.
        fontSizeValue - The new font size.
        fontSizeInPt - The font size in pt units.
        ws - The word spacing parameter (in display units)
      • TextPosition

        public TextPosition(int pageRotation,
                            float pageWidthValue,
                            float pageHeightValue,
                            Matrix textPositionSt,
                            Matrix textPositionEnd,
                            float maxFontH,
                            float individualWidth,
                            float spaceWidth,
                            java.lang.String string,
                            PDFont currentFont,
                            float fontSizeValue,
                            int fontSizeInPt)
        Constructor.
        Parameters:
        pageRotation - rotation of the page that the text is located in
        pageWidthValue - width of the page that the text is located in
        pageHeightValue - height of the page that the text is located in
        textPositionSt - TextMatrix for start of text (in display units)
        textPositionEnd - TextMatrix for end of text (in display units)
        maxFontH - Maximum height of text (in display units)
        individualWidth - The width of the given character/string. (in ? units)
        spaceWidth - The width of the space character. (in display units)
        string - The character to be displayed.
        currentFont - The current font for this text position.
        fontSizeValue - The new font size.
        fontSizeInPt - The font size in pt units.
      • TextPosition

        public TextPosition(int pageRotation,
                            float pageWidthValue,
                            float pageHeightValue,
                            Matrix textPositionSt,
                            float endXValue,
                            float endYValue,
                            float maxFontH,
                            float individualWidth,
                            float spaceWidth,
                            java.lang.String string,
                            PDFont currentFont,
                            float fontSizeValue,
                            int fontSizeInPt)
        Constructor.
        Parameters:
        pageRotation - rotation of the page that the text is located in
        pageWidthValue - width of the page that the text is located in
        pageHeightValue - height of the page that the text is located in
        textPositionSt - TextMatrix for start of text (in display units)
        endXValue - x coordinate of the end position
        endYValue - y coordinate of the end position
        maxFontH - Maximum height of text (in display units)
        individualWidth - The width of the given character/string. (in ? units)
        spaceWidth - The width of the space character. (in display units)
        string - The character to be displayed.
        currentFont - The current font for this text position.
        fontSizeValue - The new font size.
        fontSizeInPt - The font size in pt units.
      • TextPosition

        public TextPosition(int pageRotation,
                            float pageWidthValue,
                            float pageHeightValue,
                            Matrix textPositionSt,
                            float endXValue,
                            float endYValue,
                            float maxFontH,
                            float individualWidth,
                            float spaceWidth,
                            java.lang.String string,
                            int[] codePoints,
                            PDFont currentFont,
                            float fontSizeValue,
                            int fontSizeInPt)
        Constructor.
        Parameters:
        pageRotation - rotation of the page that the text is located in
        pageWidthValue - width of the page that the text is located in
        pageHeightValue - height of the page that the text is located in
        textPositionSt - TextMatrix for start of text (in display units)
        endXValue - x coordinate of the end position
        endYValue - y coordinate of the end position
        maxFontH - Maximum height of text (in display units)
        individualWidth - The width of the given character/string. (in ? units)
        spaceWidth - The width of the space character. (in display units)
        string - The character to be displayed.
        codePoints - An array containing the codepoints of the given string.
        currentFont - The current font for this text position.
        fontSizeValue - The new font size.
        fontSizeInPt - The font size in pt units.
    • Method Detail

      • getCharacter

        public java.lang.String getCharacter()
        Return the string of characters stored in this object.
        Returns:
        The string on the screen.
      • getCodePoints

        public int[] getCodePoints()
        Return the codepoints of the characters stored in this object.
        Returns:
        an array containing all codepoints.
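        The number of codepoints can be smaller than the length of the string, because a character outside the Basic Multilingual Plane occupies two Java chars. The following is a minimal JDK-only sketch of that relationship; CodePointSketch is a hypothetical helper for illustration and not part of this class.

```java
// Hypothetical helper, not part of PDFBox: maps a String to its Unicode code points.
final class CodePointSketch {

    /** Returns one array entry per Unicode code point in the given text. */
    static int[] codePointsOf(String text) {
        // A surrogate pair (two chars) collapses into a single code point here.
        return text.codePoints().toArray();
    }
}
```

        For the string "a" followed by U+1F600, the string length is 3 while the returned array has 2 entries.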
      • getTextPos

        public Matrix getTextPos()
        Return the Matrix textPos stored in this object.
        Returns:
        The Matrix containing the information about the starting text position.
      • getDir

        public float getDir()
        Return the direction/orientation of the string in this object based on its text matrix.
        Returns:
        The direction of the text (0, 90, 180, or 270)
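        One way a direction in quarter turns could be derived from a text matrix is to take the angle of the transformed x axis and snap it to the nearest multiple of 90. The sketch below assumes a and b are the first-row entries of the matrix; it is an illustration of the idea, not necessarily how this class computes the value, and TextDirectionSketch is a hypothetical name.

```java
// Hypothetical helper, not part of PDFBox: snaps a matrix rotation to 0, 90, 180 or 270.
final class TextDirectionSketch {

    /**
     * a and b are assumed to be the first-row entries of the text matrix,
     * i.e. the image of the x unit vector.
     */
    static float directionOf(float a, float b) {
        double degrees = Math.toDegrees(Math.atan2(b, a));
        if (degrees < 0) {
            degrees += 360.0; // normalize to [0, 360)
        }
        long quarterTurns = Math.round(degrees / 90.0);
        return (float) ((quarterTurns * 90) % 360); // 360 wraps back to 0
    }
}
```

        Whether 90 means clockwise or counter-clockwise depends on the orientation of the coordinate system the matrix is expressed in.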
      • getX

        public float getX()
        This will get the page rotation adjusted x position of the character. This is adjusted based on page rotation so that the upper left is 0,0.
        Returns:
        The x coordinate of the character.
      • getXDirAdj

        public float getXDirAdj()
        This will get the text direction adjusted x position of the character. This is adjusted based on text direction so that the first character in that direction is in the upper left at 0,0.
        Returns:
        The x coordinate of the text.
      • getY

        public float getY()
        This will get the y position of the text, adjusted so that 0,0 is upper left and it is adjusted based on the page rotation.
        Returns:
        The adjusted y coordinate of the character.
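        For an unrotated page, moving the origin from the lower left (the default origin of PDF user space) to the upper left amounts to subtracting the y coordinate from the page height. The sketch below covers only that unrotated case; rotated pages also need the axes swapped, which is not shown. UpperLeftOriginSketch and its parameter names are hypothetical.

```java
// Hypothetical helper, not part of PDFBox: flips a y coordinate for an unrotated page.
final class UpperLeftOriginSketch {

    /** Converts a y measured from the bottom edge into a y measured from the top edge. */
    static float toUpperLeftY(float pageHeight, float lowerLeftY) {
        return pageHeight - lowerLeftY;
    }
}
```

        On a page 792 units high, text placed 720 units above the bottom edge ends up 72 units below the top edge.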
      • getYDirAdj

        public float getYDirAdj()
        This will get the y position of the text, adjusted so that 0,0 is upper left and it is adjusted based on the text direction.
        Returns:
        The adjusted y coordinate of the character.
      • getWidth

        public float getWidth()
        This will get the width of the string when page rotation adjusted coordinates are used.
        Returns:
        The width of the text in display units.
      • getWidthDirAdj

        public float getWidthDirAdj()
        This will get the width of the string when text direction adjusted coordinates are used.
        Returns:
        The width of the text in display units.
      • getHeight

        public float getHeight()
        This will get the maximum height of all characters in this string.
        Returns:
        The maximum height of all characters in this string.
      • getHeightDir

        public float getHeightDir()
        This will get the maximum height of all characters in this string.
        Returns:
        The maximum height of all characters in this string.
      • getFontSize

        public float getFontSize()
        This will get the font size that this object is supposed to be drawn at.
        Returns:
        The font size.
      • getFontSizeInPt

        public float getFontSizeInPt()
        This will get the font size in pt. To get this size, the PDF font size is multiplied by the scaling from the text matrix.
        Returns:
        The font size in pt.
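        As a worked example of that multiplication, the sketch below scales a font size by a single scaling factor. Which matrix entry supplies the factor is an assumption left open here, and FontSizeSketch is a hypothetical helper.

```java
// Hypothetical helper, not part of PDFBox: applies a text matrix scaling to a font size.
final class FontSizeSketch {

    /** Multiplies the PDF font size by the scaling factor taken from the text matrix. */
    static float fontSizeInPt(float fontSize, float textMatrixScale) {
        return fontSize * textMatrixScale;
    }
}
```

        A font size of 12 under a scaling of 1.5 gives 18 pt.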
      • getFont

        public PDFont getFont()
        This will get the font for the text being drawn.
        Returns:
        The font for the text being drawn.
      • getWordSpacing

        @Deprecated
        public float getWordSpacing()
        Deprecated.
        This will get the current word spacing.
        Returns:
        The current word spacing.
      • getWidthOfSpace

        public float getWidthOfSpace()
        This will get the width of a space character. This is useful for some algorithms such as the text stripper, that need to know the width of a space character.
        Returns:
        The width of a space character.
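        A text extraction algorithm can use the space width to decide whether a horizontal gap between two pieces of text is a word break. The sketch below is one simple heuristic under stated assumptions: left-to-right text on a single line, and a threshold of half a space width chosen only for illustration. Glyph is a hypothetical stand-in for the few values read from a TextPosition.

```java
// Hypothetical helper, not part of PDFBox: inserts spaces based on horizontal gaps.
final class SpaceGapSketch {

    /** Stand-in for the TextPosition values used below. */
    static final class Glyph {
        final String text;
        final float x;            // start x of the text
        final float width;        // width of the text
        final float widthOfSpace; // width of a space in the same units

        Glyph(String text, float x, float width, float widthOfSpace) {
            this.text = text;
            this.x = x;
            this.width = width;
            this.widthOfSpace = widthOfSpace;
        }
    }

    /** Joins glyphs in order, adding a space where the gap exceeds half a space width. */
    static String join(Glyph[] glyphs) {
        StringBuilder out = new StringBuilder();
        Glyph previous = null;
        for (Glyph g : glyphs) {
            if (previous != null) {
                float gap = g.x - (previous.x + previous.width);
                if (gap > 0.5f * g.widthOfSpace) {
                    out.append(' ');
                }
            }
            out.append(g.text);
            previous = g;
        }
        return out.toString();
    }
}
```

        The threshold is a tuning choice; a smaller fraction splits words more eagerly, a larger one tolerates wider letter spacing.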
      • getXScale

        public float getXScale()
        Returns:
        Returns the xScale.
      • getYScale

        public float getYScale()
        Returns:
        Returns the yScale.
      • getIndividualWidths

        public float[] getIndividualWidths()
        Get the widths of each individual character.
        Returns:
        An array that is the same length as the length of the string.
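        Because the array has one width per character, the start position of each character can be estimated with a running sum. The sketch below assumes left-to-right text and widths expressed in the same units as the start x, which the constructor documentation does not confirm. CharOffsetsSketch is a hypothetical helper.

```java
// Hypothetical helper, not part of PDFBox: derives per-character x positions from widths.
final class CharOffsetsSketch {

    /** Returns the start x of each character, given the start x of the string and its widths. */
    static float[] startPositions(float startX, float[] widths) {
        float[] positions = new float[widths.length];
        float x = startX;
        for (int i = 0; i < widths.length; i++) {
            positions[i] = x; // character i starts where the previous one ended
            x += widths[i];
        }
        return positions;
    }
}
```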
      • toString

        public java.lang.String toString()
        Show the string data for this text position.
        Overrides:
        toString in class java.lang.Object
        Returns:
        A human readable form of this object.
      • contains

        public boolean contains(TextPosition tp2)
        Determine if this TextPosition logically contains another (i.e. they overlap and should be rendered on top of each other).
        Parameters:
        tp2 - The other TextPosition to compare against
        Returns:
        True if tp2 is contained in the bounding box of this text.
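        The idea of an overlap test can be sketched with plain axis-aligned boxes. This is a simplified illustration; the actual method may apply tolerances or require a minimum amount of overlap, which is not shown. Box and OverlapSketch are hypothetical names, with coordinates assumed to use an upper-left origin.

```java
// Hypothetical helper, not part of PDFBox: tests whether two bounding boxes intersect.
final class OverlapSketch {

    /** Axis-aligned box: left and top edge plus width and height. */
    static final class Box {
        final float left;
        final float top;
        final float width;
        final float height;

        Box(float left, float top, float width, float height) {
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
        }
    }

    /** True if the boxes intersect both horizontally and vertically. */
    static boolean overlaps(Box a, Box b) {
        boolean xOverlap = a.left < b.left + b.width && b.left < a.left + a.width;
        boolean yOverlap = a.top < b.top + b.height && b.top < a.top + a.height;
        return xOverlap && yOverlap;
    }
}
```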
      • mergeDiacritic

        public void mergeDiacritic(TextPosition diacritic,
                                   TextNormalize normalize)
        Merge a single character TextPosition into the current object. This is to be used only for cases where we have a diacritic that overlaps an existing TextPosition. In a graphical display, we could overlay them, but for text extraction we need to merge them. Use the contains() method to test if two objects overlap.
        Parameters:
        diacritic - TextPosition to merge into the current TextPosition.
        normalize - Instance of TextNormalize class to be used to normalize diacritic
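        On the string level, merging a base character with a diacritic resembles Unicode composition. The sketch below uses the JDK's java.text.Normalizer rather than TextNormalize, and it assumes the diacritic is already a combining mark; a spacing accent would first need to be mapped to its combining form. DiacriticMergeSketch is a hypothetical helper.

```java
// Hypothetical helper, not part of PDFBox: composes a base character with a combining mark.
final class DiacriticMergeSketch {

    /** Appends the combining mark to the base text and composes the result (NFC). */
    static String merge(String base, String combiningMark) {
        return java.text.Normalizer.normalize(
                base + combiningMark, java.text.Normalizer.Form.NFC);
    }
}
```

        Merging "e" with U+0301 (combining acute accent) yields the single character U+00E9.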
      • isDiacritic

        public boolean isDiacritic()
        Returns:
        True if the current character is a diacritic char.
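        One plausible way to classify a single character as a diacritic is by its Unicode general category. The sketch below treats non-spacing marks, modifier symbols and modifier letters as diacritics; the exact categories this method checks are an assumption, and DiacriticCheckSketch is a hypothetical helper.

```java
// Hypothetical helper, not part of PDFBox: classifies a one-character string by Unicode category.
final class DiacriticCheckSketch {

    /** True if the text is exactly one char whose category suggests a diacritic. */
    static boolean looksLikeDiacritic(String text) {
        if (text.length() != 1) {
            return false; // only single characters are considered
        }
        int type = Character.getType(text.charAt(0));
        return type == Character.NON_SPACING_MARK
                || type == Character.MODIFIER_SYMBOL
                || type == Character.MODIFIER_LETTER;
    }
}
```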