org.apache.sysml.api.mlcontext

Class ScriptExecutor



  • public class ScriptExecutor
    extends Object
    ScriptExecutor executes a DML or PYDML Script object using SystemML. This is accomplished by calling the execute(org.apache.sysml.api.mlcontext.Script) method.

    Script execution via the MLContext API typically consists of the following steps:

    1. Language Steps
      1. Parse script into program
      2. Live variable analysis
      3. Validate program
    2. HOP (High-Level Operator) Steps
      1. Construct HOP DAGs
      2. Static rewrites
      3. Intra-/Inter-procedural analysis
      4. Dynamic rewrites
      5. Compute memory estimates
      6. Rewrite persistent reads and writes (MLContext-specific)
    3. LOP (Low-Level Operator) Steps
      1. Construct LOP DAGs
      2. Generate runtime program
      3. Execute runtime program
      4. Dynamic recompilation

    Modifications to these steps can be accomplished by subclassing ScriptExecutor. For example, the following code will turn off the global data flow optimization check by subclassing ScriptExecutor and overriding the globalDataFlowOptimization method.

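    // Assumes ml is an existing MLContext and script is the Script to execute.
    ScriptExecutor scriptExecutor = new ScriptExecutor() {
      // Turn off global data flow optimization by overriding the step with a no-op.
      @Override
      protected void globalDataFlowOptimization() {
        // intentionally empty
      }
    };
    ml.execute(script, scriptExecutor);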

    For more information, please see the execute(org.apache.sysml.api.mlcontext.Script) method.
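
    The subclassing mechanism described above can be illustrated without SystemML on the classpath. The following is a minimal sketch, not ScriptExecutor itself: MiniExecutor, its three steps (parse, optimize, run), and MiniExecutorDemo are hypothetical names invented for illustration. It shows only that an execute() method calling protected steps in a fixed order lets a subclass skip one step by overriding it with an empty body.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical miniature of an executor whose execute() calls overridable steps in order. */
class MiniExecutor {
    protected final List<String> trace = new ArrayList<>();

    /** Template method: fixes the order in which the steps run. */
    public final List<String> execute() {
        parse();
        optimize();
        run();
        return trace;
    }

    // Each step records its name so the effect of an override is visible.
    protected void parse()    { trace.add("parse"); }
    protected void optimize() { trace.add("optimize"); }
    protected void run()      { trace.add("run"); }
}

class MiniExecutorDemo {
    /** Returns an executor whose optimize() step is overridden to do nothing. */
    static MiniExecutor withoutOptimization() {
        return new MiniExecutor() {
            @Override
            protected void optimize() {
                // intentionally empty: the step is skipped
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(new MiniExecutor().execute());    // [parse, optimize, run]
        System.out.println(withoutOptimization().execute()); // [parse, run]
    }
}
```

    Because the steps are protected rather than private, the override needs no change to the calling code, which is the same property the ScriptExecutor example relies on.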

    • Constructor Summary

      Constructors 
      Constructor and Description
      ScriptExecutor()
      ScriptExecutor constructor.
      ScriptExecutor(org.apache.sysml.conf.DMLConfig config)
      ScriptExecutor constructor, where the configuration properties are passed in.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method and Description
      protected void checkScriptHasTypeAndString()
      Check that the Script object has a type (DML or PYDML) and a string representing the content of the Script.
      protected void cleanupAfterExecution()
      Perform any necessary cleanup operations after program execution.
      protected void cleanupRuntimeProgram()
      If maintainSymbolTable is true, delete all 'remove variable' instructions so as to maintain the values in the symbol table, which are useful when working interactively in an environment such as the Spark Shell.
      protected void constructHops()
      Construct DAGs of high-level operators (HOPs) for each block of statements.
      protected void constructLops()
      Construct DAGs of low-level operators (LOPs) based on the DAGs of high-level operators (HOPs).
      protected void countCompiledMRJobsAndSparkInstructions()
      Count the number of compiled MR Jobs/Spark Instructions in the runtime program and set this value in the statistics.
      protected void createAndInitializeExecutionContext()
      Create an execution context and set its variables to be the symbol table of the script.
      MLResults execute(Script script)
      Execute a DML or PYDML script.
      protected void executeRuntimeProgram()
      Execute the runtime program.
      protected void generateRuntimeProgram()
      Create runtime program.
      org.apache.sysml.parser.DMLProgram getDmlProgram()
      Obtain the program.
      org.apache.sysml.parser.DMLTranslator getDmlTranslator()
      Obtain the translator.
      org.apache.sysml.runtime.controlprogram.context.ExecutionContext getExecutionContext()
      Obtain the execution context.
      org.apache.sysml.runtime.controlprogram.Program getRuntimeProgram()
      Obtain the runtime program.
      Script getScript()
      Obtain the Script object associated with this ScriptExecutor.
      protected void globalDataFlowOptimization()
      Optimize the program.
      protected void initializeCachingAndScratchSpace()
      Check security, create scratch space, cleanup working directories, initialize caching, and reset statistics.
      boolean isMaintainSymbolTable()
      Obtain whether or not all values should be maintained in the symbol table after execution.
      protected void liveVariableAnalysis()
      Liveness analysis is performed on the program, obtaining sets of live-in and live-out variables by forward and backward passes over the program.
      protected void parseScript()
      Parse the script into an ANTLR parse tree, and convert this parse tree into a SystemML program.
      protected void restoreInputsInSymbolTable()
      Restore the input variables in the symbol table after script execution.
      protected void rewriteHops()
      Apply static rewrites, perform intra-/inter-procedural analysis to propagate size information into functions, apply dynamic rewrites, and compute memory estimates for all HOPs.
      protected void rewritePersistentReadsAndWrites()
      Replace persistent reads and writes with transient reads and writes in the symbol table.
      void setConfig(org.apache.sysml.conf.DMLConfig config)
      Set the SystemML configuration properties.
      void setExplain(boolean explain)
      Whether or not an explanation of the DML/PYDML program should be output to standard output.
      void setExplainLevel(MLContext.ExplainLevel explainLevel)
      Set the level of program explanation that should be displayed if explain is set to true.
      void setGPU(boolean enabled)
      Whether or not to enable GPU usage.
      void setInit(boolean init)
      Whether or not to initialize the scratch_space, bufferpool, etc.
      void setMaintainSymbolTable(boolean maintainSymbolTable)
      Set whether or not all values should be maintained in the symbol table after execution.
      void setStatistics(boolean statistics)
      Whether or not statistics about the DML/PYDML program should be output to standard output.
      void setStatisticsMaxHeavyHitters(int maxHeavyHitters)
      Set the maximum number of heavy hitters to display with statistics.
      protected void setup(Script script)
      Sets the script in the ScriptExecutor, checks that the script has a type and string, sets the ScriptExecutor in the script, sets the script string in the Spark Monitor, and globally sets the script type.
      protected void showExplanation()
      Output a description of the program to standard output.
      protected void validateScript()
      Semantically validate the program's expressions, statements, and statement blocks in a single recursive pass over the program.
    • Field Detail

      • config

        protected org.apache.sysml.conf.DMLConfig config
      • dmlProgram

        protected org.apache.sysml.parser.DMLProgram dmlProgram
      • dmlTranslator

        protected org.apache.sysml.parser.DMLTranslator dmlTranslator
      • runtimeProgram

        protected org.apache.sysml.runtime.controlprogram.Program runtimeProgram
      • executionContext

        protected org.apache.sysml.runtime.controlprogram.context.ExecutionContext executionContext
      • script

        protected Script script
      • init

        protected boolean init
      • explain

        protected boolean explain
      • gpu

        protected boolean gpu
      • statistics

        protected boolean statistics
      • statisticsMaxHeavyHitters

        protected int statisticsMaxHeavyHitters
      • maintainSymbolTable

        protected boolean maintainSymbolTable
    • Constructor Detail

      • ScriptExecutor

        public ScriptExecutor()
        ScriptExecutor constructor.
      • ScriptExecutor

        public ScriptExecutor(org.apache.sysml.conf.DMLConfig config)
        ScriptExecutor constructor, where the configuration properties are passed in.
        Parameters:
        config - the configuration properties to be used by the ScriptExecutor
    • Method Detail

      • constructHops

        protected void constructHops()
        Construct DAGs of high-level operators (HOPs) for each block of statements.
      • rewriteHops

        protected void rewriteHops()
        Apply static rewrites, perform intra-/inter-procedural analysis to propagate size information into functions, apply dynamic rewrites, and compute memory estimates for all HOPs.
      • showExplanation

        protected void showExplanation()
        Output a description of the program to standard output.
      • constructLops

        protected void constructLops()
        Construct DAGs of low-level operators (LOPs) based on the DAGs of high-level operators (HOPs).
      • generateRuntimeProgram

        protected void generateRuntimeProgram()
        Create runtime program. For each namespace, translate function statement blocks into function program blocks and add these to the runtime program. For each top-level block, add the program block to the runtime program.
      • countCompiledMRJobsAndSparkInstructions

        protected void countCompiledMRJobsAndSparkInstructions()
        Count the number of compiled MR Jobs/Spark Instructions in the runtime program and set this value in the statistics.
      • createAndInitializeExecutionContext

        protected void createAndInitializeExecutionContext()
        Create an execution context and set its variables to be the symbol table of the script.
      • setup

        protected void setup(Script script)
        Sets the script in the ScriptExecutor, checks that the script has a type and string, sets the ScriptExecutor in the script, sets the script string in the Spark Monitor, and globally sets the script type. Also performs GPU initialization.
        Parameters:
        script - the DML or PYDML script to execute
      • cleanupAfterExecution

        protected void cleanupAfterExecution()
        Perform any necessary cleanup operations after program execution.
      • restoreInputsInSymbolTable

        protected void restoreInputsInSymbolTable()
        Restore the input variables in the symbol table after script execution.
      • cleanupRuntimeProgram

        protected void cleanupRuntimeProgram()
        If maintainSymbolTable is true, delete all 'remove variable' instructions so as to maintain the values in the symbol table, which are useful when working interactively in an environment such as the Spark Shell. Otherwise, only delete 'remove variable' instructions for registered outputs.
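
        The two cases described above can be sketched as a simple filter. This is an illustration only: the "rmvar <name>" string format, the CleanupSketch class, and its cleanup method are assumptions made for the example and do not reflect how SystemML represents instructions internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class CleanupSketch {
    // Hypothetical textual form of a 'remove variable' instruction.
    private static final String REMOVE_PREFIX = "rmvar ";

    /**
     * Returns the instructions that remain after cleanup.
     * If maintainSymbolTable is true, every 'remove variable' instruction is dropped.
     * Otherwise, only those that would remove a registered output are dropped.
     */
    static List<String> cleanup(List<String> instructions,
                                boolean maintainSymbolTable,
                                Set<String> registeredOutputs) {
        List<String> kept = new ArrayList<>();
        for (String inst : instructions) {
            if (inst.startsWith(REMOVE_PREFIX)) {
                String variable = inst.substring(REMOVE_PREFIX.length());
                if (maintainSymbolTable || registeredOutputs.contains(variable)) {
                    continue; // drop the instruction so the variable survives execution
                }
            }
            kept.add(inst);
        }
        return kept;
    }
}
```

        With the instructions "X = rand", "rmvar tmp", "rmvar out" and the registered output out, the sketch keeps only "X = rand" when maintainSymbolTable is true, and keeps "X = rand" and "rmvar tmp" otherwise.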
      • executeRuntimeProgram

        protected void executeRuntimeProgram()
        Execute the runtime program. This involves execution of the program blocks that make up the runtime program and may involve dynamic recompilation.
      • initializeCachingAndScratchSpace

        protected void initializeCachingAndScratchSpace()
        Check security, create scratch space, cleanup working directories, initialize caching, and reset statistics.
      • globalDataFlowOptimization

        protected void globalDataFlowOptimization()
        Optimize the program.
      • parseScript

        protected void parseScript()
        Parse the script into an ANTLR parse tree, and convert this parse tree into a SystemML program. Parsing includes lexical/syntactic analysis.
      • rewritePersistentReadsAndWrites

        protected void rewritePersistentReadsAndWrites()
        Replace persistent reads and writes with transient reads and writes in the symbol table.
      • setConfig

        public void setConfig(org.apache.sysml.conf.DMLConfig config)
        Set the SystemML configuration properties.
        Parameters:
        config - The configuration properties
      • liveVariableAnalysis

        protected void liveVariableAnalysis()
        Liveness analysis is performed on the program, obtaining sets of live-in and live-out variables by forward and backward passes over the program.
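
        As a rough illustration of what live-in sets are, the sketch below runs the textbook backward pass over a straight-line list of statements. It is not SystemML's implementation: LivenessSketch and Stmt are hypothetical names, control flow is ignored, and only the backward direction is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class LivenessSketch {
    /** Hypothetical statement: the variable it defines and the variables it reads. */
    static final class Stmt {
        final String def;
        final Set<String> uses;

        Stmt(String def, String... uses) {
            this.def = def;
            this.uses = new TreeSet<>(Arrays.asList(uses));
        }
    }

    /**
     * Backward pass over straight-line code.
     * For each statement: liveIn = uses + (liveOut - def).
     * Returns the live-in set of every statement, in program order.
     */
    static List<Set<String>> liveIn(List<Stmt> program, Set<String> liveAtExit) {
        List<Set<String>> result = new ArrayList<>();
        Set<String> live = new TreeSet<>(liveAtExit);
        for (int i = program.size() - 1; i >= 0; i--) {
            Stmt s = program.get(i);
            live.remove(s.def);   // the definition kills the variable
            live.addAll(s.uses);  // reads make variables live before the statement
            result.add(0, new TreeSet<>(live));
        }
        return result;
    }

    public static void main(String[] args) {
        // a = ...; b = f(a); c = g(a, b); only c is needed afterwards
        List<Stmt> program = Arrays.asList(
            new Stmt("a"), new Stmt("b", "a"), new Stmt("c", "a", "b"));
        Set<String> exit = new TreeSet<>(Arrays.asList("c"));
        System.out.println(liveIn(program, exit)); // [[], [a], [a, b]]
    }
}
```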
      • validateScript

        protected void validateScript()
        Semantically validate the program's expressions, statements, and statement blocks in a single recursive pass over the program. Constant and size propagation occurs during this step.
      • checkScriptHasTypeAndString

        protected void checkScriptHasTypeAndString()
        Check that the Script object has a type (DML or PYDML) and a string representing the content of the Script.
      • getDmlProgram

        public org.apache.sysml.parser.DMLProgram getDmlProgram()
        Obtain the program.
        Returns:
        the program
      • getDmlTranslator

        public org.apache.sysml.parser.DMLTranslator getDmlTranslator()
        Obtain the translator.
        Returns:
        the translator
      • getRuntimeProgram

        public org.apache.sysml.runtime.controlprogram.Program getRuntimeProgram()
        Obtain the runtime program.
        Returns:
        the runtime program
      • getExecutionContext

        public org.apache.sysml.runtime.controlprogram.context.ExecutionContext getExecutionContext()
        Obtain the execution context.
        Returns:
        the execution context
      • getScript

        public Script getScript()
        Obtain the Script object associated with this ScriptExecutor.
        Returns:
        the Script object associated with this ScriptExecutor
      • setExplain

        public void setExplain(boolean explain)
        Whether or not an explanation of the DML/PYDML program should be output to standard output.
        Parameters:
        explain - true if explanation should be output, false otherwise
      • setStatistics

        public void setStatistics(boolean statistics)
        Whether or not statistics about the DML/PYDML program should be output to standard output.
        Parameters:
        statistics - true if statistics should be output, false otherwise
      • setStatisticsMaxHeavyHitters

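        public void setStatisticsMaxHeavyHitters(int maxHeavyHitters)
        Set the maximum number of heavy hitters to display with statistics.
        Parameters:
        maxHeavyHitters - the maximum number of heavy hitters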
      • isMaintainSymbolTable

        public boolean isMaintainSymbolTable()
        Obtain whether or not all values should be maintained in the symbol table after execution.
        Returns:
        true if all values should be maintained in the symbol table, false otherwise
      • setMaintainSymbolTable

        public void setMaintainSymbolTable(boolean maintainSymbolTable)
        Set whether or not all values should be maintained in the symbol table after execution.
        Parameters:
        maintainSymbolTable - true if all values should be maintained in the symbol table, false otherwise
      • setInit

        public void setInit(boolean init)
        Whether or not to initialize the scratch_space, bufferpool, etc. Note that any redundant initialization (e.g., multiple scripts from one MLContext) clears existing files from the scratch space and buffer pool.
        Parameters:
        init - true if initialization should be performed, false otherwise
      • setExplainLevel

        public void setExplainLevel(MLContext.ExplainLevel explainLevel)
        Set the level of program explanation that should be displayed if explain is set to true.
        Parameters:
        explainLevel - the level of program explanation
      • setGPU

        public void setGPU(boolean enabled)
        Whether or not to enable GPU usage.
        Parameters:
        enabled - true if enabled, false otherwise

Copyright © 2017 The Apache Software Foundation. All rights reserved.