Class FileTypeUtil

  • All Implemented Interfaces:
    SyntaxConstants

    public final class FileTypeUtil
    extends java.lang.Object
    implements SyntaxConstants
    Utility methods that help you determine what type of code is in a file, so you can decide how to syntax highlight it. Methods are provided to analyze both the file name and the actual file content.

    Typically, you'll want to inspect the file name before loading the file into an RSyntaxTextArea or TextEditorPane instance for best performance. Here's an example of how to do so:

     File file = getFileToOpen();
     // Open the file for editing
     TextEditorPane textArea = new TextEditorPane();
     textArea.load(FileLocation.create(file));
     // Guess the type of code to use for syntax highlighting
     String style = FileTypeUtil.get().guessContentType(file);
     textArea.setSyntaxEditingStyle(style);
     
    Sometimes you won't be able to identify the type of code in a file or stream; for example, if there is no extension on a shell script, or if you're displaying output read from a stream (say HTML or XML) instead of a flat file. In such a case, you can try to guess the content's file type as follows:

     File file = getFileToOpen();
     // Open the file for editing
     TextEditorPane textArea = new TextEditorPane();
     textArea.load(FileLocation.create(file));
     // Guess the type of code to use for syntax highlighting
     String style = FileTypeUtil.get().guessContentType(file);
     if (SyntaxConstants.SYNTAX_STYLE_NONE.equals(style)) {
         style = FileTypeUtil.get().guessContentType(textArea);
     }
     textArea.setSyntaxEditingStyle(style);
     
    This logic primarily looks at the first line of the content and attempts to identify the following:
    • A shebang, and if so, what file type is being interpreted
    • Whether there's an XML processing instruction
    • Whether there's an HTML doctype tag
    This logic is in a separate method from the one that checks the file name, primarily for performance. Rather than open the file twice (once to determine the content type, and again to read it into the text area), it's better to simply read the content into the text area as SyntaxConstants.SYNTAX_STYLE_NONE, then guess the content type as shown above.
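
    The first-line checks listed above can be sketched roughly as follows. This is only an illustration, not the library's actual implementation: the class name ContentSniffSketch, the method guessFromFirstLine, and the style strings are all assumptions made for this example, and the shebang handling covers just two interpreters.

```java
import java.util.Locale;

// Hypothetical sketch; not the library's actual implementation.
class ContentSniffSketch {

    // These style strings are assumptions chosen for illustration.
    static final String STYLE_NONE = "text/plain";
    static final String STYLE_XML = "text/xml";
    static final String STYLE_HTML = "text/html";
    static final String STYLE_SHELL = "text/unix";
    static final String STYLE_PYTHON = "text/python";

    // Guesses a style by looking only at the first line of the content.
    static String guessFromFirstLine(String content) {
        if (content == null || content.isEmpty()) {
            return STYLE_NONE;
        }
        int newline = content.indexOf('\n');
        String firstLine = (newline < 0 ? content : content.substring(0, newline)).trim();
        String lower = firstLine.toLowerCase(Locale.ROOT);

        if (lower.startsWith("#!")) {
            // Shebang: decide based on the interpreter named on the line
            if (lower.contains("python")) {
                return STYLE_PYTHON;
            }
            if (lower.endsWith("sh") || lower.contains("sh ")) {
                return STYLE_SHELL;
            }
            return STYLE_NONE;
        }
        if (lower.startsWith("<?xml")) {
            return STYLE_XML; // XML processing instruction
        }
        if (lower.startsWith("<!doctype html")) {
            return STYLE_HTML; // HTML doctype
        }
        return STYLE_NONE;
    }
}
```

    Like the documented methods, the sketch returns a "none" style rather than null when nothing matches, so callers can pass the result straight to the text area.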
    Version:
    1.0
    • Field Detail

      • map

        private java.util.Map<java.lang.String,java.util.List<java.lang.String>> map
      • DEFAULT_IGNORE_BACKUP_EXTENSIONS

        private static final boolean DEFAULT_IGNORE_BACKUP_EXTENSIONS
        See Also:
        Constant Field Values
    • Constructor Detail

      • FileTypeUtil

        private FileTypeUtil()
    • Method Detail

      • get

        public static FileTypeUtil get()
        Returns the singleton instance of this class.
        Returns:
        The singleton instance of this class.
      • fileFilterToPattern

        public static java.util.regex.Pattern fileFilterToPattern(java.lang.String fileFilter)
        Converts a String representing a wildcard file filter into a Pattern containing a regular expression good for finding files that match the wildcard expression.

        Example: For

        Pattern pattern = FileTypeUtil.fileFilterToPattern("*.c");

        the returned pattern will be equivalent to the regular expression ^.*\.c$.

        Case-sensitivity is taken into account appropriately.

        Parameters:
        fileFilter - The file filter for which to create an equivalent regular expression. This filter can currently only contain the wildcards '*' and '?'.
        Returns:
        A Pattern representing an equivalent regular expression for the string passed in.
        Throws:
        java.util.regex.PatternSyntaxException - If the file filter could not be parsed.
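
        A wildcard-to-regex conversion of this kind can be sketched as below. This is a minimal illustration, not the library's implementation: the class WildcardSketch and its toPattern method are hypothetical, and the explicit caseSensitive flag is an assumption standing in for however the real method decides on case sensitivity.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the library's actual implementation.
class WildcardSketch {

    // Converts a filter containing only '*' and '?' wildcards to a Pattern.
    static Pattern toPattern(String fileFilter, boolean caseSensitive) {
        StringBuilder sb = new StringBuilder("^");
        for (int i = 0; i < fileFilter.length(); i++) {
            char ch = fileFilter.charAt(i);
            switch (ch) {
                case '*':
                    sb.append(".*"); // any run of characters
                    break;
                case '?':
                    sb.append('.'); // exactly one character
                    break;
                default:
                    // Quote everything else so characters such as '.' stay literal
                    sb.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        sb.append('$');
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(sb.toString(), flags);
    }

    public static void main(String[] args) {
        Pattern p = toPattern("*.c", true);
        System.out.println(p.matcher("main.c").matches());
        System.out.println(p.matcher("main.cpp").matches());
    }
}
```

        Quoting each literal character keeps the '.' in a filter such as "*.c" from acting as a regex wildcard, which is the main pitfall of a naive conversion.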
      • fileFilterToPatternImpl

        private static java.lang.String fileFilterToPatternImpl(java.lang.String filter)
      • getDefaultContentTypeToFilterMap

        public java.util.Map<java.lang.String,java.util.List<java.lang.String>> getDefaultContentTypeToFilterMap()
        Returns the default mapping of content types to lists of wildcard file filters used by this class.
        Returns:
        The mapping.
      • guessContentType

        public java.lang.String guessContentType(RSyntaxTextArea textArea)
        Guesses the text area's content type based on its content (e.g. whether it contains "#!" at the top).
        Parameters:
        textArea - The text area to examine.
        Returns:
        The guessed content type. This may be SyntaxConstants.SYNTAX_STYLE_NONE if nothing can be determined, but will never be null.
        See Also:
        SyntaxConstants, guessContentType(File), guessContentType(File, boolean)
      • guessContentType

        public java.lang.String guessContentType(java.io.File file)
        Guesses the type of content in a file, based on its name. Backup extensions will be ignored.
        Parameters:
        file - The file, which may be null.
        Returns:
        The guessed file type. This may be SyntaxConstants.SYNTAX_STYLE_NONE if nothing can be determined, but will never be null.
        See Also:
        SyntaxConstants, guessContentType(File, boolean), guessContentType(RSyntaxTextArea)
      • guessContentType

        public java.lang.String guessContentType(java.io.File file,
                                                 java.util.Map<java.lang.String,java.util.List<java.lang.String>> filters)
        Guesses the type of content in a file, based on its name. Backup extensions will be ignored.

        Note you'll typically only need to call this overload if your application implements syntax highlighting for additional/custom languages, or supports syntax highlighting files with an extension the default implementation doesn't know about.

        Parameters:
        file - The file, which may be null.
        filters - The map of SyntaxConstants values to lists of wildcard filters. If this is null, a default set of filters is used.
        Returns:
        The guessed file type. This may be SyntaxConstants.SYNTAX_STYLE_NONE if nothing can be determined, but will never be null.
        See Also:
        SyntaxConstants, guessContentType(File, boolean), guessContentType(RSyntaxTextArea)
      • guessContentType

        public java.lang.String guessContentType(java.io.File file,
                                                 boolean ignoreBackupExtensions)
        Guesses the type of content in a file, based on its name.
        Parameters:
        file - The file, which may be null.
        ignoreBackupExtensions - Whether to ignore backup extensions.
        Returns:
        The guessed file type. This may be SyntaxConstants.SYNTAX_STYLE_NONE if nothing can be determined, but will never be null.
        See Also:
        SyntaxConstants, guessContentType(File), guessContentType(RSyntaxTextArea)
      • guessContentType

        public java.lang.String guessContentType(java.io.File file,
                                                 java.util.Map<java.lang.String,java.util.List<java.lang.String>> filters,
                                                 boolean ignoreBackupExtensions)
        Guesses the type of content in a file, based on its name.

        Note you'll typically only need to call this overload if your application implements syntax highlighting for additional/custom languages, or supports syntax highlighting files with an extension the default implementation doesn't know about.

        Parameters:
        file - The file, which may be null.
        filters - The map of SyntaxConstants values to lists of wildcard filters. If this is null, a default set of filters is used.
        ignoreBackupExtensions - Whether to ignore backup extensions.
        Returns:
        The guessed file type. This may be SyntaxConstants.SYNTAX_STYLE_NONE if nothing can be determined, but will never be null.
        See Also:
        SyntaxConstants, guessContentType(File), guessContentType(RSyntaxTextArea)
      • guessContentTypeImpl

        private static java.lang.String guessContentTypeImpl(java.lang.String fileName,
                                                             java.util.Map<java.lang.String,java.util.List<java.lang.String>> filters)
        Looks for a syntax style for a file name in a given map.
        Parameters:
        fileName - The file name, possibly with a backup extension removed.
        filters - The map of SyntaxConstants values to lists of wildcard filters.
        Returns:
        The syntax style for the file, or null if nothing could be determined.
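
        A lookup of a file name in a map of styles to wildcard filters might look like the sketch below. It is not the library's code: FilterLookupSketch, lookup, and wildcardToPattern are hypothetical names, the style keys in main are example values, and case-insensitive matching is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; not the library's actual implementation.
class FilterLookupSketch {

    // Converts a filter containing only '*' and '?' wildcards to a regex.
    static Pattern wildcardToPattern(String filter) {
        StringBuilder sb = new StringBuilder();
        for (char ch : filter.toCharArray()) {
            if (ch == '*') {
                sb.append(".*");
            } else if (ch == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(ch)));
            }
        }
        // Assumption: match case-insensitively
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Returns the first style whose filters match the name, or null if none do.
    static String lookup(String fileName, Map<String, List<String>> filters) {
        for (Map.Entry<String, List<String>> entry : filters.entrySet()) {
            for (String filter : entry.getValue()) {
                // matches() tests the whole name, so no explicit anchors are needed
                if (wildcardToPattern(filter).matcher(fileName).matches()) {
                    return entry.getKey();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> filters = new LinkedHashMap<>();
        filters.put("text/java", Arrays.asList("*.java"));
        filters.put("text/c", Arrays.asList("*.c", "*.h"));
        System.out.println(lookup("Main.java", filters));
        System.out.println(lookup("README", filters));
    }
}
```

        Returning null here mirrors the contract described above; a public wrapper would presumably translate that null into SyntaxConstants.SYNTAX_STYLE_NONE before returning to callers.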
      • initFiltersImpl

        private static void initFiltersImpl(java.util.Map<java.lang.String,java.util.List<java.lang.String>> map,
                                            java.lang.String syntax,
                                            java.lang.String... filters)
      • initializeFilters

        private void initializeFilters()
      • stripBackupExtensions

        public static java.lang.String stripBackupExtensions(java.lang.String fileName)
        Strips the following extensions from the end of a file name, if they are there:
        • .orig
        • .bak
        • .old
        The idea is that these are typically backup files, and when the extension can be used to deduce a file's type/content, that extension should be ignored.
        Parameters:
        fileName - The file name. This may be null.
        Returns:
        The same file name, with any of the above extensions removed.
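
        The stripping behavior described above could be approximated as follows. This is a sketch under stated assumptions, not the library's code: BackupExtensionSketch and strip are hypothetical names, the comparison is assumed to be case-insensitive, and only a single trailing backup extension is removed.

```java
// Hypothetical sketch; not the library's actual implementation.
class BackupExtensionSketch {

    private static final String[] BACKUP_EXTENSIONS = { ".orig", ".bak", ".old" };

    // Removes one trailing backup extension, if present; null stays null.
    static String strip(String fileName) {
        if (fileName == null) {
            return null;
        }
        for (String ext : BACKUP_EXTENSIONS) {
            int start = fileName.length() - ext.length();
            // regionMatches returns false for a negative offset, so short names are safe
            if (fileName.regionMatches(true, start, ext, 0, ext.length())) {
                return fileName.substring(0, start);
            }
        }
        return fileName;
    }

    public static void main(String[] args) {
        System.out.println(strip("Main.java.bak"));
        System.out.println(strip("notes.txt"));
    }
}
```

        With the backup extension removed, a name such as "Main.java.bak" can then be matched against the ordinary "*.java" filter.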