Showing content from https://github.com/nativelibs4java/scalaxy-streams below:

nativelibs4java/scalaxy-streams: Scalaxy/Streams: make your Scala collections faster!

Latest release: 0.3.4 (2014-11-04, see Changelog)

Quick links:

Scalaxy/Streams makes your Scala 2.11.x collections code faster (official heir to ScalaCL and Scalaxy/Loops, by same author):

// For instance, given the following array:
val array = Array(1, 2, 3, 4)

// The following for comprehension:
for ((item, i) <- array.zipWithIndex; if item % 2 == 0) {
  println(s"array[$i] = $item")
}

// Is desugared by Scala to (slightly simplified):
array.zipWithIndex.withFilter((pair: (Int, Int)) => pair match {
  case (item: Int, i: Int) => true
  case _ => false
}).withFilter((pair: (Int, Int)) => pair match {
  case (item, i) =>
    item % 2 == 0
}).foreach((pair: (Int, Int)) => pair match {
  case (item, i) =>
    println(s"array[$i] = $item")
})
// Which will perform as badly and generate as many class files as you might fear.

// Scalaxy/Streams will simply rewrite it to something like:
val array = Array(1, 2, 3, 4)
var i = 0
val length = array.length
while (i < length) {
  val item = array(i)
  if (item % 2 == 0) {
    println(s"array[$i] = $item")
  }
  i += 1
}
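To check that the two forms really agree, the sketch below collects the lines into a buffer instead of printing them and compares the for comprehension with a hand-written loop of the same shape. It uses only the standard library; the object and method names (`LoopEquivalence`, `viaForComprehension`, `viaWhileLoop`) are made up for this illustration and are not part of Scalaxy/Streams.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical names, for illustration only.
object LoopEquivalence {
  val array = Array(1, 2, 3, 4)

  // Original form: goes through zipWithIndex / withFilter / foreach,
  // allocating tuples and closures along the way.
  def viaForComprehension(): List[String] = {
    val out = ArrayBuffer.empty[String]
    for ((item, i) <- array.zipWithIndex; if item % 2 == 0) {
      out += s"array[$i] = $item"
    }
    out.toList
  }

  // Hand-written loop of the kind the rewrite aims for:
  // no tuples, no closures, a single pass over the array.
  def viaWhileLoop(): List[String] = {
    val out = ArrayBuffer.empty[String]
    var i = 0
    val length = array.length
    while (i < length) {
      val item = array(i)
      if (item % 2 == 0) {
        out += s"array[$i] = $item"
      }
      i += 1
    }
    out.toList
  }
}
```

Both methods return the same two lines, one for each even item.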

Caveat: Scalaxy/Streams is still a young project and relies on experimental Scala features (macros), so:

Scalaxy/Streams rewrites streams with the following components:

The output type of each optimized stream is always the same as the original's, but when nested streams are encountered in flatMap operations, many intermediate outputs can typically be skipped, saving memory and execution time.

Note: known bugs are usually accommodated in Strategies.hasKnownLimitationOrBug when possible.

You can either use Scalaxy/Streams's compiler plugin to compile your whole project, or use its optimize macro to choose specific blocks of code to optimize.

You can always disable loop optimizations by recompiling with the environment variable SCALAXY_STREAMS_OPTIMIZE=0 or the System property scalaxy.streams.optimize=false set:

SCALAXY_STREAMS_OPTIMIZE=0 sbt clean compile ...

Or if you're not using sbt:

scalac -J-Dscalaxy.streams.optimize=false ...

If you're using sbt 0.13.0+, just put the following lines in build.sbt:

And of course, if you're serious about performance you should add the following line to your build.sbt file:

scalacOptions ++= Seq("-optimise", "-Yclosure-elim", "-Yinline")

(also consider -Ybackend:GenBCode)

Scalaxy/Streams is not available for Scala 2.10.x, so some extra legwork is needed to use it in a cross-compiling setup:

scalaVersion := "2.11.6"

crossScalaVersions := Seq("2.10.5")

resolvers += Resolver.sonatypeRepo("snapshots")

libraryDependencies <<= (scalaVersion, libraryDependencies) { (scalaVersion, libraryDependencies) =>
  if (scalaVersion.matches("2\\.10\\..*")) {
    libraryDependencies
  } else {
    // libraryDependencies :+ compilerPlugin("com.nativelibs4java" %% "scalaxy-streams" % "0.3.4")
    libraryDependencies :+ compilerPlugin("com.nativelibs4java" %% "scalaxy-streams" % "0.4-SNAPSHOT")
  }
}

autoCompilerPlugins <<= scalaVersion(scalaVersion => !scalaVersion.matches("2\\.10\\..*"))

scalacOptions <++= scalaVersion map {
  case sv if !sv.matches("2\\.10\\..*") => Seq("-Xplugin-require:scalaxy-streams")
  case _ => Seq[String]()
}

scalacOptions in Test ~= (_ filterNot (_ == "-Xplugin-require:scalaxy-streams"))

scalacOptions in Test <++= scalaVersion map {
  case sv if !sv.matches("2\\.10\\..*") => Seq("-Xplugin-disable:scalaxy-streams")
  case _ => Seq[String]()
}

With Maven, you'll need this in your pom.xml file:

The Scalaxy/Streams compiler plugin is easy to set up in Eclipse with the Scala IDE plugin:

Scalaxy/Streams is a rewrite of ScalaCL using the awesome new (and experimental) reflection APIs from Scala 2.10, and the awesome quasiquotes from Scala 2.11.

The architecture is very simple: Scalaxy/Streams deals with... streams. A stream is comprised of:

One particular operation, FlatMapOp, may contain nested streams, which allows for the chaining of complex for comprehensions:

val n = 20;
// The following for comprehension:
for (i <- 0 to n;
     ii = i * i;
     j <- i to n;
     jj = j * j;
     if (ii - jj) % 2 == 0;
     k <- (i + j) to n)
  yield { (ii, jj, k) }

// Is recognized by Scalaxy/Streams as the following stream:
// Range.map.flatMap(Range.map.withFilter.flatMap(Range.map)) -> IndexedSeq
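As a rough picture of what skipping the intermediate outputs of such a nested stream can look like, here is the same comprehension next to hand-written nested while loops. This is a sketch using only the standard library, not the code Scalaxy/Streams actually generates; `NestedStreams` and its method names are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical names, for illustration only.
object NestedStreams {
  // The comprehension from above, wrapped in a method.
  def viaForComprehension(n: Int): IndexedSeq[(Int, Int, Int)] =
    for (i <- 0 to n;
         ii = i * i;
         j <- i to n;
         jj = j * j;
         if (ii - jj) % 2 == 0;
         k <- (i + j) to n)
      yield (ii, jj, k)

  // Nested loops: ii and jj live in local variables, and only the
  // final (ii, jj, k) tuples are ever allocated.
  def viaNestedLoops(n: Int): IndexedSeq[(Int, Int, Int)] = {
    val out = ArrayBuffer.empty[(Int, Int, Int)]
    var i = 0
    while (i <= n) {
      val ii = i * i
      var j = i
      while (j <= n) {
        val jj = j * j
        if ((ii - jj) % 2 == 0) {
          var k = i + j
          while (k <= n) {
            out += ((ii, jj, k))
            k += 1
          }
        }
        j += 1
      }
      i += 1
    }
    out.toVector
  }
}
```

The loop version never builds the intermediate `(i, ii)` and `(j, jj)` pairs that the desugared `map` calls would produce, yet it yields the same elements in the same order.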

Special care is taken of tuples, by representing input and output values of stream components as tuploids (a tuploid is recursively defined as either a scalar or a tuple of tuploids).

Careful tracking of the input, transformation and output of tuploids across stream components makes it possible to optimize unneeded tuples away while materializing or preserving the needed ones (which makes TransformationClosure the most complex piece of code in the project).
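The recursive definition of a tuploid can be written down as a small algebraic data type. The sketch below is only an illustration of that definition: the names `Tuploid`, `Scalar`, `TupleOf` and `scalars` are assumptions made here and do not necessarily match the types used inside Scalaxy/Streams.

```scala
// Hypothetical names, for illustration only.
object Tuploids {
  // A tuploid is either a scalar or a tuple of tuploids.
  sealed trait Tuploid
  final case class Scalar(name: String) extends Tuploid
  final case class TupleOf(components: List[Tuploid]) extends Tuploid

  // Lists the scalar leaves of a tuploid, left to right.
  def scalars(t: Tuploid): List[String] = t match {
    case Scalar(name)        => List(name)
    case TupleOf(components) => components.flatMap(scalars)
  }

  // The shape ((item, i), flag).
  val example: Tuploid =
    TupleOf(List(TupleOf(List(Scalar("item"), Scalar("i"))), Scalar("flag")))
}
```

Here `scalars(example)` gives `List("item", "i", "flag")`. One way to read this: if every scalar leaf can be held in its own local variable, the enclosing tuples only need to be allocated when some component actually uses them as a whole.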

A conservative whitelist-based side-effect analysis detects "pure" functions (e.g. (x: Int) => x + 1) and "probably pure" ones (e.g. (x: Any) => x.toString). Different optimization strategies then decide what is worthwhile and safe to optimize (for instance, most List operations are highly optimized in the standard Scala library, so it only makes sense to optimize List-based streams if there's more than one operation in the call chain).
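The idea of a conservative whitelist can be sketched in a few lines: anything not explicitly listed is treated as potentially side-effecting, and a closure is only as pure as its least pure call. Everything below is a deliberately simplified assumption (the `Purity` levels, the method-name sets and the `classify` functions are invented for this illustration); the real analysis works on typed trees, not on method names.

```scala
// Hypothetical names and whitelists, for illustration only.
object PuritySketch {
  sealed abstract class Purity(val rank: Int)
  case object Pure extends Purity(0)
  case object ProbablyPure extends Purity(1)
  case object Unsafe extends Purity(2)

  // Assumed whitelists; anything absent from both is considered unsafe.
  private val pureMethods = Set("+", "-", "*", "==")
  private val probablyPureMethods = Set("toString", "hashCode", "equals")

  def classify(method: String): Purity =
    if (pureMethods(method)) Pure
    else if (probablyPureMethods(method)) ProbablyPure
    else Unsafe

  // A closure body is only as pure as its least pure call.
  def classifyAll(methods: Seq[String]): Purity =
    methods.map(classify).foldLeft(Pure: Purity) { (a, b) =>
      if (a.rank >= b.rank) a else b
    }
}
```

With these assumed lists, a body that only calls `+` is classified as `Pure`, one that calls `toString` as `ProbablyPure`, and one that calls `println` as `Unsafe`, simply because `println` is not on any whitelist.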

Finally, the cake pattern is used to assemble the source, ops, sink and stream extractors together with macro or compiler plugin universes.

Some collection streams should not be optimized, either because they're known to be already "as fast as possible", or because some of their operations have side effects that would behave differently once "optimized".
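To see how side effects can behave differently, compare a strict List chain with a hand-fused loop. This sketch uses only the standard library and a hand-written loop as a stand-in for an optimized stream; it is not actual Scalaxy/Streams output, and the names are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical names, for illustration only.
object SideEffectOrder {
  // Strict List semantics: the first map runs on every element
  // before the second map starts.
  def strictChain(): (List[Int], List[String]) = {
    val log = ArrayBuffer.empty[String]
    val result = List(1, 2)
      .map { x => log += s"a$x"; x + 1 }
      .map { x => log += s"b$x"; x * 2 }
    (result, log.toList)
  }

  // Fused loop: both steps run on one element before moving to the next.
  def fusedLoop(): (List[Int], List[String]) = {
    val log = ArrayBuffer.empty[String]
    val out = List.newBuilder[Int]
    var rest = List(1, 2)
    while (rest.nonEmpty) {
      val x = rest.head
      log += s"a$x"
      val y = x + 1
      log += s"b$y"
      out += y * 2
      rest = rest.tail
    }
    (out.result(), log.toList)
  }
}
```

Both versions compute `List(4, 6)`, but the strict chain logs `a1, a2, b2, b3` while the fused loop logs `a1, b2, a2, b3`. When the closures are pure this reordering is invisible; when they are not, fusing changes observable behavior.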

There are 4 different optimization strategies:

When using the optimize macro, strategies can be enabled locally by an import:

```scala
import scalaxy.streams.optimize
import scalaxy.streams.strategy.safer
optimize {
  ...
}
```

The global default strategy can also be set through the SCALAXY_STREAMS_STRATEGY environment variable, or the scalaxy.streams.strategy Java property:

```
SCALAXY_STREAMS_STRATEGY=aggressive sbt clean run
```

```
scalac -J-Dscalaxy.streams.strategy=aggressive ...
```

Found a bug? Please report it (your help will be much appreciated!).

If you want to build / test / hack on this project:

Incidentally, using Scalaxy reduces the number of classes generated by scalac and produces smaller code overall. To see the difference (68K vs. 172K as of June 12th 2014):

    git clone git://github.com/ochafik/Scalaxy.git
    cd Scalaxy/Example
    SCALAXY_STREAMS_OPTIMIZE=1 sbt clean compile && du -h target/scala-2.11/classes/
    SCALAXY_STREAMS_OPTIMIZE=0 sbt clean compile && du -h target/scala-2.11/classes/

Why is this not part of Scala?

Good question! When ScalaCL was announced in 2009, it generated lots of interest in the community, but at that time it was just a dirty hack.

Crafting an optimization engine that works in all cases and doesn't introduce bugs is a very hard and time-consuming problem, and I've had very little time for Scala in the past few years. Most of that time was spent rewriting ScalaCL / Scalaxy a couple of times because of API changes (painfully bleeding-edge RCs, new reflection API, quasiquotes...) and because I got less... inexperienced.

As for the Scala team, I did get some enthusiastic reactions and life-saving advice from Paul Phillips and Eugene Burmako, but my understanding is that the official approach to optimizations is to avoid library-specific hacks (see this thread). Making inlining work well in general is already a big enough challenge, and they have limited compiler developer resources. And Scalaxy/Streams works well outside the compiler for adventurous users (and optimizes Scala itself just fine), so why make the compiler more complex?

Coughs... if one of the Scala forks (Typelevel, Policy) wants to bundle Scalaxy/Streams in its distro, well, be my BSD license guest.

And if anyone wants to tackle SI-1338, I'm happy to help (a bit :-)).

Scalaxy/Streams can optimize itself, although that is still being experimented with:

./Resources/scripts/self_optimize.sh

To use the (experimental!) self-optimized compiler plugin / macros:

scalaVersion := "2.11.6"

autoCompilerPlugins := true

addCompilerPlugin("com.nativelibs4java" %% "scalaxy-streams-experimental-self-optimized" % "0.4-SNAPSHOT")

scalacOptions += "-Xplugin-require:scalaxy-streams"

resolvers += Resolver.sonatypeRepo("snapshots")

Your feedback is precious: did you run into issues with the self-optimized plugin? Do you find it any faster?

(TODO(ochafik): check that the self-optimized version compiles scala itself just as fine as the original)

