Class MagicMimeMimeDetector
The magic.mime rules are loaded from the following locations:
- From a JVM system property magic-mime, i.e. -Dmagic-mime=../my/magic/mime/rules
- From any file named magic.mime that can be found on the classpath
- From a file named .magic.mime in the user's home directory
- From the normal Unix locations /usr/share/file/magic.mime and /etc/magic.mime (in that order)
- From the internal magic.mime file eu.medsea.mimeutil.magic.mime if, and only if, no files are located in step 4 above (the Unix locations).
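The file system part of this lookup can be sketched roughly as follows. This is an illustration only, not the detector's actual code: the class MagicMimeLocations and the method candidateFiles are hypothetical names, and the sketch assumes the magic-mime property holds a single path. The classpath lookup and the internal fallback file are only noted in comments.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of mime-util: lists the file system
// locations described above in the order they are documented.
final class MagicMimeLocations {

    // The normal Unix locations, in the documented order.
    static final String[] UNIX_LOCATIONS = {
        "/usr/share/file/magic.mime",
        "/etc/magic.mime"
    };

    // propertyValue: value of the magic-mime system property (may be null).
    // homeDir: the user's home directory.
    static List<String> candidateFiles(String propertyValue, String homeDir) {
        List<String> candidates = new ArrayList<>();
        // Step 1: the JVM system property (assumed here to be a single path).
        if (propertyValue != null && !propertyValue.isEmpty()) {
            candidates.add(propertyValue);
        }
        // Step 2 (any magic.mime on the classpath) is a resource lookup,
        // not a plain file path, so it is omitted from this sketch.
        // Step 3: .magic.mime in the user's home directory.
        candidates.add(new File(homeDir, ".magic.mime").getPath());
        // Step 4: the normal Unix locations.
        for (String location : UNIX_LOCATIONS) {
            candidates.add(location);
        }
        // Step 5 (the internal magic.mime) only applies when step 4
        // finds nothing, so it is not listed here.
        return candidates;
    }
}
```

A caller might pass System.getProperty("magic-mime") and System.getProperty("user.home") and then keep only the entries that exist on disk.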
You can add new mime mapping rules using the syntax defined for the Unix magic.mime file by placing these rules in any of the files or locations listed above. You can also change an existing mapping rule by redefining the existing rule in one of the files listed above. This is handy for some of the more sketchy rules defined in the existing Unix magic.mime files.
We extended the string type rule so that you can match strings in a file when you do not know the exact offset of the string containing the magic information. In effect the rule says “what I am looking for will be somewhere within the next n characters from this location”. This is an important improvement to the string matching rules, especially for text based documents such as HTML and XML formats.
The reasoning for this was that the rules for matching SVG images defined in the original 'magic.mime' file hardly ever worked, because of the fixed offset definitions within the magic rule format. XML documents generally have an XML declaration that can contain various optional attributes, so the length of this header often cannot be determined. We therefore cannot know that the DOCTYPE declaration for an SVG XML file starts at a particular location. All we can say is that, if this is an SVG XML file, it will have an SVG DOCTYPE somewhere near the beginning of the file, probably within the first 1024 characters. So we test for the XML declaration, then test for the DOCTYPE within a specified number of characters, and if it is found we match this rule.
This extension can be used to better identify all of the XML type mime mappings in the current 'magic.mime' file. Remember though, as we stated earlier, mime type matching using any of the supported mechanisms is not an exact science and should always be viewed as a 'best guess' and not as a 'definite match'.
An example of overriding the PNG and SVG rules can be found in our internal 'magic.mime' file located in the test_files directory (this file is NOT used when locating rules and is used for testing purposes only). The PNG rule overrides the original PNG rule defined in the 'magic.mime' file we took from the Internet, and the SVG rule overrides the SVG detection also defined in the original 'magic.mime' file:
#PNG Image Format
0 string \211PNG\r\n\032\n image/png
#SVG Image Format
# We know its an XML file so it should start with an XML declaration.
0 string \<?xml\ version= text/xml
# As the XML declaration in an XML file can be short or extended we cannot know
# exactly where the declaration ends i.e. how long it is,
# also it could be terminated by a new line(s) or a space(s).
# So the next line states that somewhere after the 15th character position we should find the DOCTYPE declaration.
# This DOCTYPE declaration should be within 1024 characters from the 15th character
>15 string>1024< \<!DOCTYPE\ svg\ PUBLIC\ "-//W3C//DTD\ SVG image/svg+xml
As you can see, the extension is defined using the syntax string>bufsize<. It can only be used on a string type and basically means: match this within bufsize characters from the position defined at the beginning of the line. This rule is much more verbose than required, as we really only need to check for the presence of SVG. As we said earlier, this is a test case file and is not used by the utility under normal circumstances. The test mime-types.properties and magic.mime files we use can be found in the test_files directory of this distribution.
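To make the idea concrete, here is a minimal sketch of the kind of check the extended string rule expresses. It is not the detector's implementation: the class WindowedStringMatch and its method are hypothetical names, the data is assumed to decode as ISO-8859-1 so that each byte maps to one character, and the sketch assumes the whole string must fall inside the window, which may differ from the real matching semantics.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a "string>bufsize<" style check.
final class WindowedStringMatch {

    // Returns true if needle occurs entirely inside the window of
    // bufsize characters that starts at offset in data.
    static boolean matchesWithin(byte[] data, int offset, int bufsize, String needle) {
        if (offset < 0 || offset >= data.length || bufsize <= 0) {
            return false; // nothing to search
        }
        // Clamp the window to the end of the available data.
        int end = Math.min(data.length, offset + bufsize);
        String window = new String(data, offset, end - offset, StandardCharsets.ISO_8859_1);
        return window.contains(needle);
    }
}
```

For the SVG rule shown above, the equivalent call would search from offset 15 with a bufsize of 1024 for the DOCTYPE string, after a separate fixed-offset check for the XML declaration at offset 0.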
We use the application/directory
mime type to identify
directories. Even though this is not an official mime type it seems to be
well accepted on the net as an unofficial mime type so we thought it was OK
for us to use as well.
This class is auto loaded by MimeUtil as it has an entry in the file called MimeDetectors. MimeUtil reads this file at startup and calls Class.forName() on each entry found. This means the MimeDetector must have a no arg constructor.
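The no arg constructor requirement follows from how reflective loading works. The sketch below is an assumption about the general pattern, not MimeUtil's actual loading code: DetectorLoadingSketch and its nested classes are hypothetical stand-ins, and only the standard Class.forName() and Constructor.newInstance() calls are real.

```java
// Hypothetical illustration of loading a class by name.
final class DetectorLoadingSketch {

    // Stand-in for a detector with the required no arg constructor.
    public static class NoArgDetector {
        public NoArgDetector() { }
    }

    // Stand-in for a detector that cannot be created without arguments.
    public static class ArgOnlyDetector {
        public ArgOnlyDetector(String config) { }
    }

    // Instantiates className through its no arg constructor.
    // Returns null if the class is missing or has no such constructor.
    static Object load(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }
}
```

Loading NoArgDetector by name succeeds, while ArgOnlyDetector fails with a NoSuchMethodException that the sketch turns into null.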
Field Summary
- defaultLocations
Constructor Summary
- MagicMimeMimeDetector()
Method Summary
- getDescription(): Abstract method to be implemented by concrete MimeDetector(s).
- getMimeTypesByteArray(byte[] data): Get the mime types that may be contained in the data array.
- getMimeTypesFile(File file): Defer this call to the InputStream method.
- getMimeTypesFileName(String fileName): Defer this call to the File method.
- getMimeTypesInputStream(InputStream in): Get the mime types of the data in the specified InputStream.
- getMimeTypesURL(URL url): Defer this call to the InputStream method.
Methods inherited from class eu.medsea.mimeutil.detector.MimeDetector
closeStream, delete, getMimeTypes, getMimeTypes, getMimeTypes, getMimeTypes, getMimeTypes, getName, init
Field Details
defaultLocations
Constructor Details
MagicMimeMimeDetector
public MagicMimeMimeDetector()
Method Details
getDescription
Description copied from class: MimeDetector
Abstract method to be implemented by concrete MimeDetector(s).
- Specified by: getDescription in class MimeDetector
- Returns: description of this MimeDetector
getMimeTypesByteArray
Get the mime types that may be contained in the data array.
- Specified by: getMimeTypesByteArray in class MimeDetector
- Parameters: data - The byte array that contains data we want to detect mime types from.
- Returns: the mime types.
- Throws:
  MimeException - if for instance we try to match beyond the end of the data.
  UnsupportedOperationException
getMimeTypesInputStream
Get the mime types of the data in the specified InputStream. The InputStream must support mark and reset (see InputStream.markSupported()). If it does not support mark and reset, a MimeException is thrown.
- Specified by: getMimeTypesInputStream in class MimeDetector
- Parameters: in - the stream from which to read the data.
- Returns: the mime types.
- Throws:
  MimeException - if the specified InputStream does not support mark and reset (see InputStream.markSupported()).
  UnsupportedOperationException
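A caller holding a stream that does not support mark and reset can wrap it before detection. The following is a minimal sketch using only the standard java.io classes; the helper class MarkSupport is a hypothetical name and is not part of this library.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;

// Hypothetical helper: makes sure a stream supports mark and reset.
final class MarkSupport {

    // Returns the stream itself if it already supports mark and reset,
    // otherwise wraps it in a BufferedInputStream, which does.
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }
}
```

The returned stream can then be handed to the detection call without triggering the MimeException described above.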
getMimeTypesFileName
Defer this call to the File method.
- Specified by: getMimeTypesFileName in class MimeDetector
- Parameters: fileName
- Returns: Collection of matched MimeType(s)
- Throws: UnsupportedOperationException
getMimeTypesURL
Defer this call to the InputStream method.
- Specified by: getMimeTypesURL in class MimeDetector
- Returns: Collection of matched MimeType(s)
- Throws: UnsupportedOperationException
getMimeTypesFile
Defer this call to the InputStream method.
- Specified by: getMimeTypesFile in class MimeDetector
- Parameters: file
- Returns: Collection of matched MimeType(s)
- Throws: UnsupportedOperationException