Class Joining

All Implemented Interfaces:
Collector<CharSequence,Joining.Accumulator,String>

An advanced implementation of a joining Collector. This collector can join the input CharSequence elements with a given delimiter, optionally wrap the result in a given prefix and suffix, and optionally limit the length of the resulting string (in Unicode code units, code points or grapheme clusters), appending the specified ellipsis sequence when the limit is reached. This collector supersedes the standard JDK Collectors.joining(CharSequence, CharSequence, CharSequence) family of collectors.

This collector is short-circuiting when the string length is limited in any of these ways. Otherwise it is not short-circuiting.

Every specific collector represented by this class is immutable, so it can be shared safely. A number of methods are provided to create a new collector based on this one.

To create a Joining collector, use the with(CharSequence) static method and specify the delimiter. For further setup, use the instance methods, each of which returns a new Joining object, like this:


 StreamEx.of(source).collect(Joining.with(", ").wrap("[", "]")
         .maxCodePoints(100).cutAtWord());
 

The intermediate accumulation type of this collector is an implementation detail and is not exposed in the API. If you want to assign the collector to a variable of Collector type, use ? as the accumulator type variable:


 Collector<CharSequence, ?, String> joining = Joining.with(", ");
 
Since:
0.4.1
  • Field Details

    • CUT_ANYWHERE

      private static final int CUT_ANYWHERE
    • CUT_CODEPOINT

      private static final int CUT_CODEPOINT
    • CUT_GRAPHEME

      private static final int CUT_GRAPHEME
    • CUT_WORD

      private static final int CUT_WORD
    • CUT_BEFORE_DELIMITER

      private static final int CUT_BEFORE_DELIMITER
    • CUT_AFTER_DELIMITER

      private static final int CUT_AFTER_DELIMITER
    • LENGTH_CHARS

      private static final int LENGTH_CHARS
    • LENGTH_CODEPOINTS

      private static final int LENGTH_CODEPOINTS
    • LENGTH_GRAPHEMES

      private static final int LENGTH_GRAPHEMES
    • LENGTH_ELEMENTS

      private static final int LENGTH_ELEMENTS
    • delimiter

      private final String delimiter
    • ellipsis

      private final String ellipsis
    • prefix

      private final String prefix
    • suffix

      private final String suffix
    • cutStrategy

      private final int cutStrategy
    • lenStrategy

      private final int lenStrategy
    • maxLength

      private final int maxLength
    • limit

      private int limit
    • delimCount

      private int delimCount
  • Constructor Details

    • Joining

      private Joining(String delimiter, String ellipsis, String prefix, String suffix, int cutStrategy, int lenStrategy, int maxLength)
  • Method Details

    • init

      private void init()
    • length

      private int length(CharSequence s, boolean content)
    • copy

      private static int copy(char[] buf, int pos, String str)
    • copyCut

      private int copyCut(char[] buf, int pos, String str, int limit, int cutStrategy)
    • finisherNoOverflow

      private String finisherNoOverflow(Joining.Accumulator acc)
    • withLimit

      private Joining withLimit(int lenStrategy, int maxLength)
    • withCut

      private Joining withCut(int cutStrategy)
    • with

      public static Joining with(CharSequence delimiter)
      Returns a Collector that concatenates the input elements, separated by the specified delimiter, in encounter order.

      This collector is similar to Collectors.joining(CharSequence), but it can be configured further, for example by specifying the maximal allowed length of the resulting String.

      Parameters:
      delimiter - the delimiter to be used between each element
      Returns:
      A Collector which concatenates CharSequence elements, separated by the specified delimiter, in encounter order
    • wrap

      public Joining wrap(CharSequence prefix, CharSequence suffix)
      Returns a Collector which behaves like this collector, but additionally wraps the result with the specified prefix and suffix.

      The collector returned by Joining.with(delimiter).wrap(prefix, suffix) is equivalent to Collectors.joining(CharSequence, CharSequence, CharSequence), but it can be configured further, for example by specifying the maximal allowed length of the resulting String.

      If a length limit is specified for the collector, the prefix and suffix lengths also count towards this limit. If the combined length of the prefix and the suffix exceeds the limit, the resulting collector will not accumulate any elements and will always produce the same output. For example, stream.collect(Joining.with(",").wrap("prefix", "suffix").maxChars(9)) will produce the string "prefixsuf" regardless of the input stream content.

      You may wrap several times: Joining.with(",").wrap("[", "]").wrap("(", ")") is equivalent to Joining.with(",").wrap("([", "])").
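
      For comparison, the JDK-only sketch below shows the Collectors.joining(CharSequence, CharSequence, CharSequence) behavior that wrap is described as matching, and checks that nesting two wraps gives the same text as one wrap with the concatenated prefix and suffix. It does not call Joining itself; the class name WrapSketch and the helper joinWrapped are illustrative only.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WrapSketch {
    // JDK counterpart of the documented Joining.with(delimiter).wrap(prefix, suffix).
    static String joinWrapped(String delimiter, String prefix, String suffix, String... items) {
        return Stream.of(items).collect(Collectors.joining(delimiter, prefix, suffix));
    }

    public static void main(String[] args) {
        // A single wrap with the combined prefix "([" and suffix "])" ...
        String combined = joinWrapped(",", "([", "])", "a", "b", "c");
        // ... gives the same text as wrapping with "[", "]" and then with "(", ")".
        String nested = "(" + joinWrapped(",", "[", "]", "a", "b", "c") + ")";
        System.out.println(combined);                // ([a,b,c])
        System.out.println(combined.equals(nested)); // true
    }
}
```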

      Parameters:
      prefix - the sequence of characters to be used at the beginning of the joined result
      suffix - the sequence of characters to be used at the end of the joined result
      Returns:
      a new Collector which wraps the result with the specified prefix and suffix.
    • ellipsis

      public Joining ellipsis(CharSequence ellipsis)
      Returns a Collector which behaves like this collector, but uses the specified ellipsis CharSequence instead of the default "..." when the string limit (if specified) is reached.
      Parameters:
      ellipsis - the sequence of characters to be used at the end of the joined result to designate that not all of the input elements are joined due to the specified string length restriction.
      Returns:
      a new Collector which will use the specified ellipsis instead of the current setting.
    • maxChars

      public Joining maxChars(int limit)
      Returns a Collector which behaves like this collector, but sets the maximal length of the resulting string to the specified number of UTF-16 characters (that is, UTF-16 code units). This setting overwrites any limit previously set by a maxChars(int), maxCodePoints(int), maxGraphemes(int) or maxElements(int) call.

      The String produced by the resulting collector is guaranteed to have a length that does not exceed the specified limit. An ellipsis sequence (by default "...") is appended to indicate that the limit was reached. Use ellipsis(CharSequence) to set a custom ellipsis sequence.

      The collector returned by this method is short-circuiting: it may not process all the input elements if the limit is reached.
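
      The length guarantee above implies that the ellipsis itself has to fit within the limit. The JDK-only sketch below illustrates that budget on an already joined string. It is a simplified model, not the Joining implementation: it cuts at an arbitrary code unit, and the class name MaxCharsSketch and the method limit are hypothetical.

```java
class MaxCharsSketch {
    // Truncates text to at most maxChars UTF-16 code units, ellipsis included.
    // The cut may fall anywhere, even inside a surrogate pair.
    static String limit(String text, int maxChars, String ellipsis) {
        if (text.length() <= maxChars) {
            return text; // limit not reached: no ellipsis is added
        }
        if (ellipsis.length() >= maxChars) {
            // Not even the whole ellipsis fits: keep as much of it as allowed.
            return ellipsis.substring(0, Math.max(0, maxChars));
        }
        // Reserve room for the ellipsis, then fill the rest with text.
        return text.substring(0, maxChars - ellipsis.length()) + ellipsis;
    }

    public static void main(String[] args) {
        System.out.println(limit("one, two, three", 10, "...")); // one, tw...
        System.out.println(limit("one, two", 10, "..."));        // one, two
    }
}
```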

      Parameters:
      limit - the maximal number of UTF-16 characters in the resulting String.
      Returns:
      a new Collector which will produce a String no longer than the given limit.
    • maxCodePoints

      public Joining maxCodePoints(int limit)
      Returns a Collector which behaves like this collector, but sets the maximal number of Unicode code points in the resulting string. This setting overwrites any limit previously set by a maxChars(int), maxCodePoints(int), maxGraphemes(int) or maxElements(int) call.

      The String produced by the resulting collector is guaranteed to have no more code points than the specified limit. An ellipsis sequence (by default "...") is appended to indicate that the limit was reached. Use ellipsis(CharSequence) to set a custom ellipsis sequence.

      The collector returned by this method is short-circuiting: it may not process all the input elements if the limit is reached.

      Parameters:
      limit - the maximal number of code points in the resulting String.
      Returns:
      a new Collector which will produce a String no longer than the given limit.
    • maxGraphemes

      public Joining maxGraphemes(int limit)
      Returns a Collector which behaves like this collector, but sets the maximal number of grapheme clusters in the resulting string. This setting overwrites any limit previously set by a maxChars(int), maxCodePoints(int), maxGraphemes(int) or maxElements(int) call.

      The grapheme cluster is defined in the Unicode Text Segmentation technical report. Basically, a base character and the combining characters that follow it are counted as a single object. The String produced by the resulting collector is guaranteed to have no more grapheme clusters than the specified limit. An ellipsis sequence (by default "...") is appended to indicate that the limit was reached. Use ellipsis(CharSequence) to set a custom ellipsis sequence.

      The collector returned by this method is short-circuiting: it may not process all the input elements if the limit is reached.
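
      To make the three length units used by maxChars(int), maxCodePoints(int) and maxGraphemes(int) concrete, the sketch below counts each of them for one short string using only JDK classes. It uses java.text.BreakIterator as an approximation of grapheme cluster segmentation, which is an assumption of this example and not necessarily how Joining counts; the class name LengthUnitsSketch is illustrative.

```java
import java.text.BreakIterator;
import java.util.Locale;

class LengthUnitsSketch {
    // Counts user-perceived characters with the JDK character BreakIterator.
    static int graphemes(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(s);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++; // one step per cluster boundary passed
        }
        return count;
    }

    public static void main(String[] args) {
        // 'e' followed by a combining acute accent, then U+1F600 as a surrogate pair.
        String s = "e\u0301\uD83D\uDE00";
        System.out.println(s.length());                      // 4 UTF-16 code units
        System.out.println(s.codePointCount(0, s.length())); // 3 code points
        System.out.println(graphemes(s));                    // 2 grapheme clusters
    }
}
```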

      Parameters:
      limit - the maximal number of grapheme clusters in the resulting String.
      Returns:
      a new Collector which will produce a String no longer than the given limit.
    • maxElements

      public Joining maxElements(int limit)
      Returns a Collector which behaves like this collector, but sets the maximal number of elements to join. This setting overwrites any limit previously set by a maxChars(int), maxCodePoints(int), maxGraphemes(int) or maxElements(int) call.

      The String produced by the resulting collector is guaranteed to contain no more input elements than the specified limit. An ellipsis sequence (by default "...") is appended to indicate that the limit was reached. Use ellipsis(CharSequence) to set a custom ellipsis sequence. The cutting strategy is mostly irrelevant for this mode, with the exception of cutBeforeDelimiter().

      The collector returned by this method is short-circuiting: it may not process all the input elements if the limit is reached.

      Parameters:
      limit - the maximal number of input elements in the resulting String.
      Returns:
      a new Collector which will produce a String containing no more input elements than the given limit.
      Since:
      0.6.7
    • cutAnywhere

      public Joining cutAnywhere()
      Returns a Collector which behaves like this collector, but cuts the resulting string at any point when the limit is reached.

      The resulting collector will produce a String whose length is exactly equal to the specified limit if the limit is reached. If used with maxChars(int), the resulting string may be cut in the middle of a surrogate pair.

      Returns:
      a new Collector which cuts the resulting string at any point when the limit is reached.
    • cutAtCodePoint

      public Joining cutAtCodePoint()
      Returns a Collector which behaves like this collector, but cuts the resulting string between any two code points when the limit is reached.

      The resulting collector will not split a surrogate pair when used with maxChars(int) or maxCodePoints(int). However, it may remove a combining character, which may result in incorrect rendering of the last displayed grapheme.
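
      The difference between cutting anywhere and cutting between code points can be sketched with plain JDK string operations. The example below is an illustration under simplifying assumptions (no ellipsis, a single input string), not the library's code; CodePointCutSketch and cutAtCodePoint are hypothetical names.

```java
class CodePointCutSketch {
    // Keeps at most maxChars code units without leaving half of a surrogate pair.
    static String cutAtCodePoint(String s, int maxChars) {
        if (maxChars >= s.length()) {
            return s;
        }
        int end = Math.max(0, maxChars);
        // If the cut would fall between a high and a low surrogate, step back by one.
        if (end > 0 && Character.isHighSurrogate(s.charAt(end - 1))
                && Character.isLowSurrogate(s.charAt(end))) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // 'a', U+1F600 (two code units), 'b'
        String naive = s.substring(0, 2); // cut anywhere: ends with a lone high surrogate
        System.out.println(Character.isHighSurrogate(naive.charAt(1))); // true
        System.out.println(cutAtCodePoint(s, 2));                       // a
        System.out.println(cutAtCodePoint(s, 3).length());              // 3
    }
}
```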

      Returns:
      a new Collector which cuts the resulting string between code points.
    • cutAtGrapheme

      public Joining cutAtGrapheme()
      Returns a Collector which behaves like this collector, but cuts the resulting string at a grapheme cluster boundary when the limit is reached. This is the default behavior.

      The grapheme cluster is defined in the Unicode Text Segmentation technical report. Thus the resulting collector will not split a surrogate pair, and it will either preserve combining characters or remove them together with their base character.
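
      A cut at a grapheme cluster boundary can be approximated with java.text.BreakIterator, as in the sketch below. It assumes the JDK character iterator is an acceptable stand-in for the Unicode segmentation rules and leaves the ellipsis out; GraphemeCutSketch and cutAtGrapheme are illustrative names, not part of this API.

```java
import java.text.BreakIterator;
import java.util.Locale;

class GraphemeCutSketch {
    // Keeps at most maxChars code units, ending on a grapheme cluster boundary.
    static String cutAtGrapheme(String s, int maxChars) {
        if (maxChars >= s.length()) {
            return s;
        }
        if (maxChars <= 0) {
            return "";
        }
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(s);
        // Use the limit itself if it is a boundary, otherwise the last boundary before it.
        int end = it.isBoundary(maxChars) ? maxChars : it.preceding(maxChars);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "xe\u0301y"; // 'x', 'e' + combining acute accent, 'y'
        // A code point cut at 2 would keep "xe" and drop the accent;
        // the grapheme cut removes the accent together with its base character.
        System.out.println(cutAtGrapheme(s, 2));          // x
        System.out.println(cutAtGrapheme(s, 3).length()); // 3
    }
}
```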

      Returns:
      a new Collector which cuts the resulting string at a grapheme cluster boundary.
    • cutAtWord

      public Joining cutAtWord()
      Returns a Collector which behaves like this collector, but cuts the resulting string at a word boundary when the limit is reached.

      The beginning and the end of every input stream element or delimiter are always considered word boundaries, so the stream of "one", "two three" collected with Joining.with("").maxChars(n).ellipsis("").cutAtWord() may produce the following strings, depending on n:

      
       ""
       "one"
       "onetwo"
       "onetwo "
       "onetwo three"
       
      Returns:
      a new Collector which cuts the resulting string at a word boundary.
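
      The list of possible results for "one", "two three" can be reproduced with a small JDK-only model. The sketch below assumes an empty delimiter and an empty ellipsis, as in that example, and treats element edges plus the java.text.BreakIterator word boundaries inside each element as allowed cut positions. It is a simulation for illustration, not the Joining implementation; WordCutSketch and joinCutAtWord are hypothetical names.

```java
import java.text.BreakIterator;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

class WordCutSketch {
    // Joins elements with no delimiter and cuts at the last word boundary within maxChars.
    static String joinCutAtWord(List<String> elements, int maxChars) {
        StringBuilder joined = new StringBuilder();
        TreeSet<Integer> cuts = new TreeSet<>();
        cuts.add(0); // cutting everything away is always possible
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        for (String element : elements) {
            int offset = joined.length();
            words.setText(element);
            // Element start and end are boundaries too: first() is 0, the last one is the length.
            for (int b = words.first(); b != BreakIterator.DONE; b = words.next()) {
                cuts.add(offset + b);
            }
            joined.append(element);
        }
        if (joined.length() <= maxChars) {
            return joined.toString();
        }
        return joined.substring(0, cuts.floor(Math.max(0, maxChars)));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("one", "two three");
        for (int n : new int[] {2, 5, 6, 10, 12}) {
            System.out.println("\"" + joinCutAtWord(input, n) + "\"");
        }
        // Prints "", "one", "onetwo", "onetwo " and "onetwo three", one per line.
    }
}
```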
    • cutBeforeDelimiter

      public Joining cutBeforeDelimiter()
      Returns a Collector which behaves like this collector, but cuts the resulting string before the delimiter when the limit is reached.
      Returns:
      a new Collector which cuts the resulting string before the delimiter.
    • cutAfterDelimiter

      public Joining cutAfterDelimiter()
      Returns a Collector which behaves like this collector, but cuts the resulting string after the delimiter when the limit is reached.
      Returns:
      a new Collector which cuts the resulting string after the delimiter.
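
      The two delimiter-based strategies differ only in where the kept text may end: right after an element, or right after a delimiter. The JDK-only sketch below models that difference under stated assumptions: the limit is counted in UTF-16 code units, the default "..." ellipsis counts towards it, and only whole elements are kept. It is one plausible reading of the descriptions above, not the actual implementation; DelimiterCutSketch and join are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class DelimiterCutSketch {
    // Joins elements, keeping only whole elements when the result exceeds maxChars.
    static String join(List<String> elements, String delimiter, int maxChars, boolean afterDelimiter) {
        String ellipsis = "...";
        String full = String.join(delimiter, elements);
        if (full.length() <= maxChars) {
            return full; // limit not reached
        }
        if (maxChars < ellipsis.length()) {
            return ellipsis.substring(0, Math.max(0, maxChars));
        }
        int budget = maxChars - ellipsis.length(); // room left for joined text
        int best = 0;
        int pos = 0;
        for (int i = 0; i < elements.size(); i++) {
            pos += elements.get(i).length(); // position right after element i
            if (!afterDelimiter && pos <= budget) {
                best = pos;
            }
            if (i < elements.size() - 1) {
                pos += delimiter.length(); // position right after the following delimiter
                if (afterDelimiter && pos <= budget) {
                    best = pos;
                }
            }
        }
        return full.substring(0, best) + ellipsis;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("one", "two", "three");
        System.out.println(join(input, ", ", 10, false)); // one...
        System.out.println(join(input, ", ", 10, true));  // one, ...
        System.out.println(join(input, ", ", 11, false)); // one, two...
    }
}
```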
    • supplier

      public Supplier<Joining.Accumulator> supplier()
    • accumulator

      public BiConsumer<Joining.Accumulator,CharSequence> accumulator()
    • combiner

      public BinaryOperator<Joining.Accumulator> combiner()
    • finisher

      public Function<Joining.Accumulator,String> finisher()
    • characteristics

      public Set<Collector.Characteristics> characteristics()
    • finished

      Specified by:
      finished in class CancellableCollector<CharSequence,Joining.Accumulator,String>