Introduction to Minitab

For The Stats Lab Computer Users

by Susan Chen
Department of Mathematics and Statistics
Simon Fraser University
This manual will introduce you to Minitab on the Stats Lab Macintosh computers. It is for students who have little experience with Minitab on a Mac computer. It may be most efficient to go through all of the procedures and exercises included in this introduction using Minitab.

A 3.5 inch floppy disk, either high density or double density, is required. They are available in the SFU Bookstore or the Quad Books. No computer account is necessary.

Table of Contents

Part I: Basics

1. Starting Minitab
2. Minitab windows
3. Entering, Editing and Saving Data
   3.1 Entering Data
       (1) Entering data from the Data window
       (2) Entering data from the Session window
       (3) Opening a Minitab Worksheet
   3.2 Editing and Manipulating Data
   3.3 Saving Worksheet
4. Saving the Session Window (Minitab Output)
   4.1 Recording Session method
   4.2 Save Window As... method
5. Printing Output and Quitting Minitab
   5.1 Printing Session Window
   5.2 Quitting Minitab
6. Editing and Printing Output
7. Ejecting your Disk

Part II: More Minitab Commands and Examples

1. Arithmetic Commands
2. Plotting Data
3. Basic Statistics
4. Regression
5. Analysis of Variance
6. Tables
7. Random Data and Distribution
References

1. Starting Minitab

The computer is turned on by pressing the button with the small triangle, which is located at the top of the keyboard. The computer is ready when the volumes Stats Lab #, and Aliases appear on the screen. Aliases contains all the frequently used applications such as Minitab, BBEdit and MSWord.

If your floppy disk is new, it may need to be formatted. Insert the disk and follow the instructions on the screen.

To start Minitab, move the mouse pointer onto Aliases and click the mouse button twice, quickly. (This is called a double click.) Then double click on Minitab Accelerated. Minitab has started when both the Session and Data windows are open. The Session window will be hidden behind the Data window.

2. Minitab Windows

Minitab has six types of windows which can all be open at the same time: Session, Data, Help, Info, History and Graph. All the windows except Help are listed under the Window menu; Help is listed under the Apple menu. A window can be made active by selecting it. For example, to make the Session window active, place the mouse pointer on the word Window, press and hold the mouse button and move it downwards to highlight Session, and then release the mouse button. Alternatively, just click it if it is visible on your screen. A window can be hidden (closed) by clicking on the small square on the upper-left corner. The active window appears in the front when more than one window is open at the same time.

The Session window, the Data window and the Help window are used most frequently and are discussed later. The other three windows are Info, History and Graph. Info summarizes the data in the current worksheet, History contains a record of previously executed commands, and a Graph window is usually created when you create a new graph.

Session Window

The Session window is used to enter Minitab commands and to display output. It has a menu bar on top and the usual Minitab prompt MTB> on the left. To enter a command, type it after the MTB> prompt and press return.

The Session window scrolls as more output goes into it. You can scroll up and down to see various parts of your output. However, the number of screens you can scroll back through is limited. It is usually between 5 and 15 screens. If your output is long and you want to save more of it, you should save your output continuously in an outfile. See section 4.1 Recording Session Method.

Data Window

The Data window, sometimes called the worksheet, displays active data entered by you or produced by the computer. You can enter, edit and view the data. The current cell is highlighted; others can be selected by clicking the mouse on them or using the arrow keys. The scroll bars are used to view different parts of the worksheet.

Help Window

The Help window gives on-line Help. For example, to find out what command to use to make a histogram, select HELP... from the Apple menu, double click on "COMMANDS by Name" from the list on the left-hand side of the screen. The header of the window will then be changed to COMMANDS by Name. Scroll down to find the command HISTOGRAM and double click on it. Minitab will respond with a brief explanation of HISTOGRAM on the right-hand side of the screen. To go back to the Minitab help window, select Minitab Help from the upper-left header. Click on Done to close Help.

Exercise 1: Open all five windows other than Graph.

3. Entering, Editing and Saving Data

3.1 Entering Data

Data can be entered from the keyboard in the Data or Session windows.
(1) Entering data from the Data window:
Example Enter the following father-son height data set into Minitab:
father's : 64 65 64 64 63 62 62 63 65 68 68 64 65 65 66 66 65 63 63 63
son's :    65 66 66 65 64 63 63 65 66 68 68 65 67 67 66 64 65 62 65 64

Select the Data window. Click on the small arrow in the upper left hand corner of the worksheet until the arrow is pointing down. Click the cell just below the column number C1; type the word father and press return to name C1 as father; type the first number 64 and press return. Continue until you have typed the last piece of the father's height data and pressed return. Similarly, we can enter the son's height into column C2 and name C2 as son.

(2) Entering data from the Session window:
Use the command set to enter data one column at a time, or the command read to enter more than one column at a time. The two methods are illustrated below. MTB > and DATA> are Minitab prompts; the text that follows each prompt is typed by the user.
------------------------------------------------------------------------------
MTB > set c3
DATA> 1 2 3 4 30 30 30 7 7
DATA> end
------------------------------------------------------------------------------
MTB > read c4 c5
DATA> 1 2
DATA> 2 4
DATA> 3 6
DATA> 4 8
DATA> end
------------------------------------------------------------------------------
Entering patterned data from the Session window is rather simple. For example,
MTB > set c6
DATA> 1:4 3(30) 2(7)
DATA> end
puts 1 2 3 4 30 30 30 7 7 into column c6.
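
The patterned-data notation can be illustrated outside Minitab. The following Python sketch is not part of Minitab; the function name expand_pattern is made up for this illustration, and it handles only the three forms used in the example (a plain integer, an ascending range a:b, and a repeat k(v)).

```python
import re


def expand_pattern(spec):
    """Expand a SET-style pattern such as '1:4 3(30) 2(7)' into a list of integers."""
    values = []
    for token in spec.split():
        repeat = re.fullmatch(r"(\d+)\((-?\d+)\)", token)
        if repeat:
            # k(v): the value v repeated k times
            count, value = int(repeat.group(1)), int(repeat.group(2))
            values.extend([value] * count)
        elif ":" in token:
            # a:b: the consecutive integers a, a+1, ..., b
            start, stop = (int(part) for part in token.split(":"))
            values.extend(range(start, stop + 1))
        else:
            # a plain integer
            values.append(int(token))
    return values


print(expand_pattern("1:4 3(30) 2(7)"))
```

The printed list should match the contents of column c6 described above.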

You can also use Set Patterned Data under Edit.

3.2 Editing and Manipulating Data

Always check that the data are entered correctly before proceeding further. A cell in the worksheet is made active by clicking on it. An active row is a row in which a cell is active. You may find the following helpful when editing a worksheet:

  • To enter a new value into an active cell: type the value and press return. It overwrites the previous contents of the cell.
  • To correct the active cell: type the correct data and press return.
  • To delete the active cell: choose Editor > Delete Cell.
  • To delete the active row: choose Editor > Delete Row.
  • To insert one cell above the active cell: choose Editor > Insert Cell.
  • To insert one row above the active cell: choose Editor > Insert Row.
  • To repeat the last insertion or deletion: choose Editor > Repeat.
  • To restore the previous value of the cell: choose Edit > Undo Change Within Cell.
  • To erase a column (variable): choose Calc > Erase variables, type the column number and press return.

Minitab handles both numeric (number) and alpha data (words). However, a column cannot contain both alpha and numeric data. Once alpha data is entered into a column, e.g., C1, the column header C1 will be altered to C1-A and C1-A cannot be used for computation. If you accidentally create an alpha column, C1-A, you cannot convert it back into numeric format. You must delete the column by typing the command ERASE C1 and re-enter the data.

3.3 Saving Worksheets

It is good practice to save the worksheet when you finish entering the data. This helps avoid having to retype the data should the computer crash. For example, to save the father-son worksheet onto your floppy disk named mydisk, choose File > Save Worksheet As...; type a name, e.g., father-son, click on Desktop if you don't see mydisk in the box, double click on mydisk, and finally click on Save. The worksheet is then saved as father-son.MTW, where .MTW is the extension that Minitab automatically adds to worksheet files. Note that Save Worksheet As... only saves the Data window, not the Session window.

3.4 Opening Worksheets

To open a worksheet, choose File > Open Worksheet..., then locate the data file. (The notation File > Open Worksheet... means select Open Worksheet from the File menu). For example, to open a worksheet named pulse.MTW located in SLS 0c Applications > Statistics > Minitab 8.2 > Data, choose File > Open Worksheet..., scroll down to Desktop, double click on SLS 0c Applications, then on Statistics, Minitab 8.2, Data, and then on pulse.MTW.

4. Saving the Session Window (Minitab Output)

There are two ways to save the Session window. The Recording Session method saves everything that appears in the Session window as it is typed. This is the recommended method. The Save Window As... method allows you to save the part of the session that you can scroll back through on the screen, or a selected part of your session. This is convenient if only a part of the session is to be saved or the output is short. Files saved in either way can be imported into a text editor (e.g., BBEdit) or a word processor (e.g., MSWord) to be edited. In the following, I will refer to your floppy disk as mydisk.

4.1 Recording Session method

The Recording Session method creates a file (outfile) to store everything that appears in the Session window. An outfile has to be created at the beginning of a Minitab session. To start an outfile, choose File > Other Files > Start Recording Session, click on Select File, then click on Desktop if you don't see mydisk in the box, double click on mydisk, and finally type a name (e.g., father-son) in the box provided under Record Session as and click on Save. Minitab will add the extension .LIS to the given name. The default name is Minitab.LIS. From this point on, all commands and output that appear in the Session window will be stored in the file until you tell the computer to stop recording. To stop recording a session, choose File > Other Files > Stop Recording Session. You can stop and start recording a session as you choose.

One main advantage of this method is that the current session can be appended to previous sessions, if you store it to the same outfile each time. This allows you to accumulate your computer outputs.

4.2 Save Window As... method:

After the Minitab session to be saved is produced, choose File > Save Window As.... To save a specific part of the Session window, first highlight the portion of the Session window to be saved (click in front of the first character to be selected, press and hold the mouse button and drag the mouse to the last character to be selected, then release the button). Then choose File > Save Selection As.... (Save Window As... becomes Save Selection As... when a part of the Session window is selected.)

Exercise 2: Start an outfile called father-son as described in section 4.1. Open the worksheet father-son.MTW saved in section 3.3 if it is not already open. Type the following Minitab commands in the Session window, pressing return after each command:

describe c1
histogram c1
plot 'son' 'father'
boxplot c2
Then stop recording as described above in 4.1. The following (except for the notes following the <-- symbols) will appear in your Session window.
---------------------------------------------------------------------------
MTB > OutFile 'father-son'. <-- choose File > Other Files > Start Recording Session
MTB > Retrieve 'father-son.MTW' <-- choose File > Open Worksheet...
 WORKSHEET SAVED  5/19/1995
Worksheet retrieved from file: father-son.MTW
MTB > describe c1 
                N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
father         20   64.400   64.000   64.333    1.698    0.380
              MIN      MAX       Q1       Q3
father     62.000   68.000   63.000   65.000

MTB > histogram c1
Histogram of father   N = 20
Midpoint   Count
      62       2  **
      63       5  *****
      64       4  ****
      65       5  *****
      66       2  **
      67       0
      68       2  **

MTB > plot 'son' 'father'
     68.0+                                                      2
         -
 son     -
         -                             2
         -
     66.0+                    *        2       *
         -
         -
         -            2       3        *
         -
     64.0+            2                        *
         -
         -
         -    2
         -
     62.0+            *
         -
           ------+---------+---------+---------+---------+---------+father

              62.4      63.6      64.8      66.0      67.2      68.4

MTB > boxplot c2

                             ------------------
             ----------------I        +       I----------------- 
                             ------------------
          ------+---------+---------+---------+---------+---------+son     
             62.4      63.6      64.8      66.0      67.2      68.4
MTB > Nooutfile <-- choose File > Other Files > Stop Recording Session
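
The describe output for father can be cross-checked outside Minitab. The Python sketch below is only an illustration and is not how Minitab computes its results. It uses the father heights listed in section 3.1 and assumes that TRMEAN drops the smallest and largest 5% of the values before averaging and that the quartiles follow the (n + 1) * p position rule.

```python
import statistics

# Father heights as listed in section 3.1
father = [64, 65, 64, 64, 63, 62, 62, 63, 65, 68,
          68, 64, 65, 65, 66, 66, 65, 63, 63, 63]

n = len(father)
mean = statistics.mean(father)
median = statistics.median(father)
stdev = statistics.stdev(father)      # sample standard deviation (divides by n - 1)
semean = stdev / n ** 0.5             # standard error of the mean

# Assumption: the trimmed mean removes 5% of the values from each end
# (one value from each end when n = 20) and averages the rest.
trim = round(0.05 * n)
trmean = statistics.mean(sorted(father)[trim:n - trim])

# Assumption: quartiles are located at positions (n + 1) * p in the sorted
# data, which is the "exclusive" method of statistics.quantiles.
q1, _, q3 = statistics.quantiles(father, n=4, method="exclusive")

print(f"N = {n}  MEAN = {mean:.3f}  MEDIAN = {median:.3f}  TRMEAN = {trmean:.3f}")
print(f"STDEV = {stdev:.3f}  SEMEAN = {semean:.3f}")
print(f"MIN = {min(father)}  MAX = {max(father)}  Q1 = {q1}  Q3 = {q3}")
```

The printed values can be compared column by column with the describe output above.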

5. Printing Output and Quitting Minitab

You can print on the Stats Lab DeskWriter or on any University LaserWriter from a computer in the stats lab. The latter produces better quality print at a cost of 5 cents per sheet, one-sided or two-sided.

To print on the Stats Lab DeskWriter, choose Chooser from the Apple menu, click on the DeskWriter icon, then select math.sfu.ca in the Zone window and Stats Lab DeskWriter in the printer window. These are usually highlighted automatically. Close the Chooser by clicking on the little square on the upper-right corner of the Chooser window. Then follow the instructions in Section 5.1 for sending a job to the printer.

To print on a university LaserWriter, follow the steps below:

Note: To print on a university printer, you will need an SFU Printing Card, available from the green and black vending machines in the MCF and the Library. The card costs one dollar. To add printing credits, re-insert the card and insert additional dollar coins. Be sure to remove your card from the card reader when you are finished printing.

The Data window cannot be printed. However, the data can be printed by displaying them in the Session window first and then printing the Session window. To display data in the Session window, e.g., columns C1 to C10 and constant K1, type the command print C1-C10 K1. Then you can select the displayed data and print the selection as described below.

5.1 Printing Session Window

5.2 Quitting Minitab

Type STOP or select File > Quit to exit Minitab. Quitting Minitab automatically stops outfile recording.

6. Editing and Printing Output

Minitab output cannot be edited within a Minitab session. However, an output file can be opened and edited in a text editor such as BBEdit:

  • double click on Aliases, which usually appears on the lower-right side of the screen;
  • double click on the BBEdit icon to launch BBEdit. A letter B will appear in the upper-right corner of the screen when BBEdit is open;
  • click on the word Desktop to bring up the floppy disk that contains the output file;
  • double click on the disk name and then the file name to open the file;
  • edit your file as you wish;
  • select Save under the File menu to save changes;
  • select Print... under the File menu if you wish to print the revised output;
  • select Quit under the File menu to quit BBEdit when you finish.

MSWord could also be used to edit the file.

    7. Ejecting your Disk

    Before you leave the computer, please close all the folders you have opened. To close a folder, click on the little square in the upper-left corner. To eject your disk, place the mouse pointer on the top of your disk icon that appears on the computer screen, press the mouse button and hold and drag it to the Trash that is usually located at the lower-right corner of the screen. Release the mouse button when Trash becomes dark. Do not use eject disk under the Special menu. Select Special > Restart to restart the computer.

    See a TA in the lab if you have any questions regarding the use of Minitab in the Stats Lab.

    Practice Problem: Open the worksheet employ.MTW from Desktop > SLS 0c Applications > Statistics > Minitab 8.2 > Data. Use the Minitab commands introduced in Section 4 to find the means of Trade, Food and Metals. Are the histograms for Trade, Food and Metals symmetric? Does the scatter plot of Metals vs. Trade show a positive association between the two variables?

    Part II: More Minitab Commands and Examples

    This part lists some frequently used Minitab commands by topic, with examples illustrating some of them. All the commands included in this booklet, and more, can be found in the Help window. A few data sets (worksheets) are used for illustration. All of them except the father-son data set come with the Minitab program and can be found on the computer. Worksheets employ.MTW, fa.MTW and pulse.MTW are stored in the folder Data and can be opened by selecting File > Open Worksheet > Data > 'file name' (e.g., fa.MTW) in Minitab. Worksheet Poplar2.MTW can be found under File > Open Worksheet > Quick Start, and anom.MTW can be opened under File > Open Worksheet > ANOM.

    All Minitab commands can be typed in the Session window, and most of them can also be selected from the pop-up menu. The Minitab commands in this part are presented as they would be typed into the Session window. Commands, except those in the examples, are underlined. In a command line only the bold-faced characters are necessary; characters within [] are optional. Commands are not case-sensitive. Also, the first four characters of a command are sufficient. That is, Correlation C1 C2 and corr c1 c2 are the same.

    1. Arithmetic Commands

    The arithmetic operations are +, -, *, /, and ** for addition, subtraction, multiplication, division and power, respectively. () is often used in algebraic expressions as well. You may open a Minitab worksheet called fa.MTW located in the Data folder in Minitab to practice commands in this section.

    
    let E = expression    computes an algebraic expression. For example,
    
    MTB > LET C3 = 2*(C1+C2)   <--- C3 is twice the sum of C1 and C2
    MTB > let k1=mean(C1)      <--- k1 is the mean of C1
    MTB > let k2=C1(2)         <--- k2 is the second element of column C1
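
For readers who want to check the arithmetic by hand, the three let commands can be mirrored in Python. This is only a sketch with made-up column values, not Minitab syntax. Note that Minitab counts rows from 1, so C1(2) corresponds to index 1 in Python.

```python
import statistics

c1 = [2, 3]   # hypothetical contents of column C1
c2 = [3, 4]   # hypothetical contents of column C2

c3 = [2 * (a + b) for a, b in zip(c1, c2)]   # let C3 = 2*(C1+C2), row by row
k1 = statistics.mean(c1)                     # let k1 = mean(C1)
k2 = c1[2 - 1]                               # let k2 = C1(2): second row, index 1

print(c3, k1, k2)
```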
    
    count, sum, mean, median, stdev, minimum, maximum, sqrt and ssq each
    compute a statistic for a column. For example,
    
    MTB > mean c1  <--- computes the mean value of column c1
    MTB > count c1 <--- counts the number of values in c1
    MTB > stdev c1 <--- computes the sample standard deviation of column c1
    MTB > ssq c1   <--- computes the sum of squared values in c1
    
    Accordingly, rcount, rsum, rmean, rmedian, rstdev, rminimum, rmaximum,
    rsqrt and rssq each computes a statistic rowwise. For example,
    
    MTB > rmean c1-c3 c11 
    MTB > rsum c1-c3 c12
    MTB > rssq c1-c3 c13
    
    c1    c2    c3            c11   c12   c13
    2     3     4     --->    3     9     29
    3     4     5     --->    4     12    50
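
The row-wise results in the table above can be verified with a few lines of Python. This is an illustrative sketch only; each tuple holds one row of c1-c3.

```python
rows = [(2, 3, 4), (3, 4, 5)]   # one tuple per worksheet row (c1, c2, c3)

c11 = [sum(r) / len(r) for r in rows]         # rmean: mean of each row
c12 = [sum(r) for r in rows]                  # rsum: sum of each row
c13 = [sum(x * x for x in r) for r in rows]   # rssq: sum of squares of each row

print(c11, c12, c13)
```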
    
    indicator variables for values in c, put into c...c  creates indicator or
    dummy variables. For example,
    
    MTB > indicator 'sex'  c21 c22
    
    'sex'          c21   c22
    1              1     0
    1      --->    1     0
    2              0     1
    2              0     1

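
The indicator example can be reproduced in Python as follows. This is a sketch under the assumption that one indicator column is created per distinct value, in ascending order of the values; the variable names are chosen to match the example.

```python
sex = [1, 1, 2, 2]   # the 'sex' column from the example

levels = sorted(set(sex))   # distinct values, ascending: [1, 2]

# One 0/1 column per level: 1 where the row equals that level, otherwise 0
indicators = [[1 if value == level else 0 for value in sex] for level in levels]
c21, c22 = indicators

print(c21, c22)
```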
    sort c...c put into c...c  sorts by the first column and carries along the
    additional columns.
    
    Sorting by multiple columns can be done with the by c...c subcommand.
    Sorting is done in ascending order unless you use the subcommand
    descending. To see how the sort command works, open a worksheet in
    Minitab8.2 > Data and type the command
    sort c1-c3 c11-c13
    to see what happens.
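
The idea of sorting by the first column while carrying the other columns along can be sketched in Python. The data below are made up for illustration and do not come from any Minitab worksheet.

```python
c1 = [3, 1, 2]        # hypothetical sort key
c2 = [30, 10, 20]     # carried along
c3 = [300, 100, 200]  # carried along

# Keep each row together, order the rows by their first entry (ascending),
# then split the sorted rows back into columns.
rows = sorted(zip(c1, c2, c3), key=lambda row: row[0])
c11, c12, c13 = (list(column) for column in zip(*rows))

print(c11, c12, c13)
```

Each value in c12 and c13 stays on the same row as the c11 value it was entered with.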
    
    rank the values in c put ranks into c  ranks the smallest number as 1, the
    second smallest number as 2, and so on; ties are assigned the average rank.
    For example,
    MTB > rank c1 c2
        c1              c2
        1               2.5
        1.5     --->    4
        0               1
        2               5
        1               2.5
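
The average-rank rule for ties can be sketched in Python. The function name average_ranks is made up for this illustration; it is not a Minitab command.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # The tied values occupy 1-based ranks pos+1 .. end+1; use their average
        average = (pos + 1 + end + 1) / 2
        for i in order[pos:end + 1]:
            ranks[i] = average
        pos = end + 1
    return ranks


print(average_ranks([1, 1.5, 0, 2, 1]))
```

Here the two values equal to 1 would occupy ranks 2 and 3, so each receives 2.5, as in the table above.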
    
    

    2. Plotting Data

    Most commands for plotting data have subcommands. Minitab uses its default settings unless you override them with the appropriate subcommands. To use a subcommand, put a semicolon at the end of the main command line. This tells the computer that subcommands will follow. Start each subcommand on a new line and end each line with a semicolon. When you are finished, end the last subcommand with a period. In this booklet, subcommands are listed right below the underlined main command, with a few exceptions. Subcommands can be used one at a time or several at a time.
    histogram c...c      prints a separate histogram for each column.
        increment = k    specifies the interval width to be k
        start = k        specifies the first midpoint to be k
        by c             produces a separate histogram for each value in c
        same             uses the same scale for all columns listed on histogram
    
    For example,
    MTB > histogram 'father';
    SUBC> start = 62;
    SUBC> increment = 2.
    
    Histogram of father   N = 20
    Midpoint   Count
       62.00       2  **
       64.00       8  ********
       66.00       7  *******
       68.00       3  ***
    
    dotplot c...c        produces a separate dotplot for each column.
        increment = k    specifies the distance between tick marks on the axis
        start = k        specifies the first tick mark
        by c             produces a separate dotplot for each value in c
        same             uses the same scale for all columns listed on dotplot
    For example,
    MTB > dotplot 'father'
    
                        .                .
                        :       :        :
                :       :       :        :       :                :
              -----+---------+---------+---------+---------+---------+father  
                62.4      63.6      64.8      66.0      67.2      68.4
    
    
    stem-and-leaf c...c  produces a separate stem-and-leaf display for each
    column. See the Help window for an explanation of the stem-and-leaf
    display; it is too lengthy to give here.
    
    For example,
    MTB > stem-and-leaf 'father'
    Stem-and-leaf of father    N  = 20
    Leaf Unit = 0.10
    
        2   62 00
        7   63 00000
       (4)  64 0000
        9   65 00000
        4   66 00
        2   67 
        2   68 00
    
    boxplot c            produces a boxplot.
        by c             produces one boxplot for each level given in c.
    For example,
    MTB > Retrieve 'pulse.MTW'
     WORKSHEET SAVED  4/ 1/1991
    
    Worksheet retrieved from file: pulse.MTW
    MTB > boxplot 'pulse1';
    SUBC> by 'sex'.
            
    SEX     
    
                                -----------
    1           *     ----------I     +   I-------------- * * 
                                -----------
    
                                   -------------------
    2                     ---------I          +      I--------------- 
                                   -------------------
              ----+---------+---------+---------+---------+---------+--PULSE1  
                 50        60        70        80        90       100
    
    plot c versus c   prints a scatter diagram with the first column on the
    vertical (y) axis and the second column on the horizontal (x) axis. Each
    point is plotted with the symbol * ordinarily. When two or more points fall
    on the same spot, a count is printed. When the count is over nine, the
    symbol + is used. Plot  has the following subcommands to control labeling
    and to specify symbol and scales:
    
        title = 'text'
        footnote = 'text'
        xlabel = 'text'
        ylabel = 'text'
        symbol = 'symbol'
        xincrement = k
        xstart = k
        yincrement = k
        ystart = k

    For example,
    
    MTB > plot c2 c1;
    SUBC> title='son vs. father';
    SUBC> symbol = 'h'.
    
      son vs. father
         68.0+                                                      2
             -
     son     -
             -                             2
             -
         66.0+                    h        2       h
             -
             -
             -            2       3        h
             -
         64.0+            h                        h       h
             -
             -
             -    2
             -
         62.0+            h
             -
               ------+---------+---------+---------+---------+---------+father
    
                  62.4      63.6      64.8      66.0      67.2      68.4
    
    mplot c vs. c,...,c vs. c plots several pairs of columns on the same axes.
    The m is for multiple. The first pair of columns is plotted with the symbol
    A, the second pair with B, and so on. If several points fall on the same
    spot, a count is given. Up to nine pairs of columns may be plotted in one
    mplot. Mplot shares the same subcommands as plot, except symbol. For
    example,
    
    MTB > Retrieve 'fa.MTW'
    WORKSHEET SAVED  4/ 1/1991
    Worksheet retrieved from file: fa.MTW
    
    MTB > mplot 'y1' 'x' 'y2' 'x' 'y3' 'x'
    
             -
             -                                                  C
             -
             -
         10.5+                                             A
             -                                                       A
             -                              2    B    B    B         C
             -                         B              A    C    B    B
             -                                   2    C         A
          7.0+               A    B    2    C
             -               2    C
             -     C    2
             -          B         A
             -     A
          3.5+
             -     B
             -
               ----+---------+---------+---------+---------+---------+--
                 4.0       6.0       8.0      10.0      12.0      14.0
            A = Y1 vs. X           B = Y2 vs. X           C = Y3 vs. X
    
    lplot c vs. c using letters as coded in c  plots data using letters for
    plotting symbols. The l is for letter. As in plot, the first column is the
    vertical (y) axis and the second column is the horizontal (x) axis. Each
    point is plotted with a letter which is determined by the number in the
    last column, using the following correspondence:
    
    ...  -2  -1   0   1   2   3  ...  22  23  24  25  26  27  28  29  ...
    ...   X   Y   Z   A   B   C  ...   V   W   X   Y   Z   A   B   C  ...
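
The correspondence between codes and letters repeats every 26 values, which can be written compactly in Python. This is an illustrative sketch; the function name lplot_letter is made up and is not part of Minitab.

```python
def lplot_letter(code):
    """Return the plotting letter for an integer code: 1 -> A, ..., 26 -> Z, 27 -> A, ..."""
    # Shift so that code 1 maps to offset 0, then wrap around every 26 letters.
    # Python's % always returns a non-negative result, so codes of 0 or below
    # wrap backwards from Z.
    return chr(ord("A") + (code - 1) % 26)


for code in (-2, -1, 0, 1, 2, 3, 26, 27):
    print(code, lplot_letter(code))
```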
    
    If several points fall on the same spot, a count is printed. Lplot has the
    same eight subcommands as mplot to control labeling and to specify scales.
    For example,
    
    MTB > Retrieve 'pulse.MTW'
     WORKSHEET SAVED  4/ 1/1991
    
    Worksheet retrieved from file: pulse.MTW
    MTB > lplot 'height' 'weight' 'sex'
    
             -
         75.0+                               A         A A
             -                          A  A 2       A   2
     height  -                             3   A A   A   A
             -                        AA A A   A   A A     A       A
             -                      AA   A A     2
         70.0+                 B A       2AAA
             -                     A   2 3 A A   A A
             -           B B   B BB B  A A 2
             -             B  AB     A A 2
             -               2 B 3 2 A
         65.0+             BB B    B
             -        B        B
             -     B      BBBB
             -          2B   B   B
             -                       B
         60.0+
               ------+---------+---------+---------+---------+---------+weight
    
                   100       125       150       175       200       225
    
    where B represents female and A represents male.
    
    tsplot [period = k] c  produces a time series plot of a column of data,
    usually observations made at equally spaced intervals in time (y axis),
    against the integers 1, 2, 3, ..., which index the times at which the
    observations were obtained (x axis).
    
    Tsplot plots data using special symbols to indicate a cycle. All you need
    to do is to specify the length of the cycle called period k in the tsplot
    command. For example, if the data are collected monthly, then the period k
    = 12. By identifying k=12, tsplot will plot each observation from January
    with a "1", from February with a "2",...., September with a "9", October
    with a "0", November with an "A" and December with a "B".
    
    If the period k is not specified, period 10 is assumed, and plotting
    symbols 1, 2,..., 9, 0, 1,... are used.
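
The way plotting symbols cycle with the period can be sketched in Python. This is an illustration only; the function name tsplot_symbols is made up, and the sketch covers periods up to 12 because those are the only symbols described above.

```python
SYMBOLS = "1234567890AB"   # symbols for positions 1 through 12 within a cycle


def tsplot_symbols(n_obs, period=10):
    """Return the plotting symbol for each of n_obs observations, in time order."""
    if not 1 <= period <= len(SYMBOLS):
        raise ValueError("this sketch only covers periods from 1 to 12")
    # Observation i (counting from 0) sits at position i % period within its cycle
    return [SYMBOLS[i % period] for i in range(n_obs)]


print("".join(tsplot_symbols(14, period=12)))   # monthly data: 14 months
print("".join(tsplot_symbols(12)))              # default period of 10
```

With period 12 the thirteenth observation starts a new cycle and is plotted with "1" again, so all January values share one symbol.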
    
    Subcommands for tsplot:
    increment=k, start=k[end=k] to specify the scale for the y axis. 
    origin=k to specify the time value associated with the first observation.
    For example, if origin = 1940 is used, then the first observation on the
    plot will be labeled 1940 on the time (x) axis, the second 1941, and so on.
    
    tstart=k [end = k] allows you to plot a subset of your time series. For
    example, if tstart = 5 is used, then the first  observation plotted is the
    5th observation and the first 4 observations are omitted from the plot.
    
     mtsplot [period k] c...c   plots several time series all on the same axes.
    
    High-Resolution Graphics 
    Here are most of the High-Resolution Graphics commands  implemented in
    Minitab:
    Ghistogram C...C
    	increment = k
    	start = k
    	by c
    	same scales for all columns
    
    Gboxplot C
    
    Gplot C C
    Gmplot C vs. C,..., C vs. C 
    Glplot C C C
    The last three commands share the same subcommands:
    	title = 'text'
    	footnote = 'text'
    	xlabel = 'text'
    	ylabel = 'text'
    	
    	xincrement = k
    	xstart = k
    	yincrement = k
    	ystart = k
    
    	lines   [style = k ] connecting pairs in  c c 
    	symbol = 'symbol' (only works with gplot)
    
    As you might have noticed, the ordinary plotting commands do not have the
    subcommand lines. For example, the following commands produce Picture 6 in
    Part I.
    
    MTB > regress c2 1 c1 c3 c4 
    MTB > name c4 'fit-son'
    MTB > gmplot c2 c1 c4 c1;
    SUBC> lines c4 c1.
    
    Note: Each high-resolution graph is displayed in a separate Graph window.
    You can print the active high-resolution Graph window by selecting Print
    window... from the File menu. High-resolution graphs cannot be saved with
    the Session window output; however, they can be saved as picture files,
    which can be inserted into other files such as an MSWord file.
    

    3. Basic Statistics

    Because of the page limit of this booklet, computer output is not
    explained in much detail here. Please use Help... or ask a TA if you need
    help understanding it.
    
    	describe	zinterval	ztest	tinterval	ttest	
    
    	twosample	twot	correlation	covariance	centre
    
    
    describe c...c       prints ten descriptive statistics for each column.
        by c             produces separate statistics for each value in c. The
                         values in c must be between -9999 and +9999.
    For example,
    
    Worksheet retrieved from file: pulse.MTW
    MTB > describe 'pulse1';
    SUBC> by 'sex'.
    
                  SEX        N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
    PULSE1          1       57    70.42    70.00    70.27     9.95     1.32
                    2       35    76.86    78.00    76.65    11.62     1.96
    
                  SEX      MIN      MAX       Q1       Q3
    PULSE1          1    48.00    92.00    63.00    75.00
                    2    58.00   100.00    66.00    86.00 	
    
    zinterval [k% confidence], sigma = k for c...c      calculates a k%
    	confidence interval for the mean, separately for each column. If k
    	is not specified, then a 95% confidence interval for the population
    	mean will be calculated. For example,
    MTB > zinterval 95 10 'pulse1'
    
    THE ASSUMED SIGMA =10.0
    
                 N      MEAN    STDEV  SE MEAN   95.0 PERCENT C.I.
    PULSE1      92     72.87    11.01     1.04  (   70.82,   74.92)
    
    ztest [mu=k] sigma=k for c...c	performs a separate z-test on each column.
    	alternative = k	k=-1 gives H1: mu < mu0; k=+1 gives H1: mu > mu0.
    
    If mu is not specified, then H0: mu = 0 is used. If the subcommand
    alternative = k is not used, a two-sided (H1: mu ≠ mu0) z-test will be
    done. For example,
    
    MTB > ztest 75 10 'pulse1';
    SUBC> alternative = -1.
    
    TEST OF MU = 75.00 VS. MU L.T. 75.00
    THE ASSUMED SIGMA = 10.0
    
                 N      MEAN    STDEV   SE MEAN        Z    P VALUE
    PULSE1      92     72.87    11.01      1.04    -2.04      0.021 
    
    The P-value of 0.021 in the output tells us that we can reject H0: mu = 75
    at the 5% level and conclude that the population mean is significantly less
    than 75.
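
The z-interval and z-test figures above can be checked by hand from the printed summary statistics. The following Python sketch is not Minitab code. It assumes only the printed values (N = 92, mean = 72.87, sigma = 10) and the usual approximation 1.96 for the 97.5th percentile of the standard normal distribution. Because the printed mean is rounded, the interval endpoints may differ from Minitab's in the last digit.

```python
import math

n, mean, sigma = 92, 72.87, 10.0   # values printed in the Minitab output
mu0 = 75.0                          # hypothesized mean under H0


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


se = sigma / math.sqrt(n)           # standard error of the mean (SE MEAN)
z = (mean - mu0) / se               # z statistic
p_value = normal_cdf(z)             # lower-tail P-value, as for alternative = -1

z_crit = 1.96                       # approximate 97.5th percentile of N(0, 1)
ci = (mean - z_crit * se, mean + z_crit * se)

print(f"SE MEAN = {se:.2f}   Z = {z:.2f}   P VALUE = {p_value:.3f}")
print(f"95 PERCENT C.I. = ({ci[0]:.2f}, {ci[1]:.2f})")
```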
    
    tinterval [k% confidence] for c...c	calculates a separate k% confidence
    	interval for the mean of each column. If k is not specified, then a
    	95% confidence interval is calculated.
    
    ttest [mu=k] for c...c	performs a separate t-test on each column.
    	alternative = k	k=-1 gives H1: mu < mu0; k=+1 gives H1: mu > mu0.
    
    If mu is not specified, then H0: mu = 0 is used. If the subcommand
    alternative = k is not used, a two-sided (H1: mu ≠ mu0) t-test will be
    done. For example,
    
    MTB > ttest 75 'pulse1';
    SUBC> alter=-1.
    TEST OF MU = 75.00 VS. MU L.T. 75.00
    
                 N      MEAN    STDEV   SE MEAN        T    P VALUE
    PULSE1      92     72.87    11.01      1.15    -1.86      0.033
    
    The P-value of 0.033 gives significant evidence (at the 5% level) against
    H0: the mean of pulse1 is 75. The test result favors the alternative that
    the mean of pulse1 is less than 75.
    
    Note: zinterval and ztest need the population standard deviation sigma, but
    tinterval and ttest do not. The t-statistic has n-1 degrees of freedom,
    where n is the sample size.
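
As a rough check of the ttest output, the standard error and t statistic can be recomputed from the printed summary values. This is a Python sketch, not Minitab code. It uses the sample standard deviation in place of sigma and does not compute the P-value, since that needs the t distribution with n-1 degrees of freedom.

```python
import math

n, mean, stdev = 92, 72.87, 11.01   # values printed in the Minitab output
mu0 = 75.0                           # hypothesized mean under H0

se = stdev / math.sqrt(n)   # estimated standard error of the mean (SE MEAN)
t = (mean - mu0) / se       # t statistic
df = n - 1                  # degrees of freedom of the t statistic

print(f"SE MEAN = {se:.2f}   T = {t:.2f}   DF = {df}")
```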
    
    twosample [k% confidence] for c1 c2  does a two-sample t-test of H0: mu1 =
    mu2, and calculates a confidence interval for (mu1-mu2). If k is not
    specified, a 95% confidence interval is calculated. 
    	alternative = k	k=-1 gives H1: mu1 < mu2; k=+1 gives H1: mu1 > mu2.
    	pooled	the common variance is estimated by the pooled variance under
    		the assumption that the two populations have the same variance.
    
    If the subcommand alternative = k is not used, a two-sided (H1: mu1 ≠ mu2)
    two-sample t-test will be done. For example,
    MTB > twosample c1 c2
    
    TWOSAMPLE T FOR father VS. son
             N      MEAN     STDEV   SE MEAN
    father  20     64.60      1.76      0.39
    son     20     65.20      1.61      0.36
    
    95 PCT CI FOR MU father - MU son: (-1.68, 0.48)
    
    TTEST MU father = MU son (VS. NE): T= -1.13  P=0.27  DF=  37
    
    The degrees of freedom of the t-statistic used in a non-pooled test and
    confidence interval are given by:
    
    $$df = \frac{(VAR_1 + VAR_2)^2}{\dfrac{VAR_1^2}{n_1 - 1} + \dfrac{VAR_2^2}{n_2 - 1}}$$
    
    where $VAR_1 = s_1^2/n_1$ and $VAR_2 = s_2^2/n_2$. Minitab truncates the
    number to an integer when necessary.
    
    When pooled is used, the t-statistic has n1+n2-2 degrees of freedom.
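
The two degrees-of-freedom rules can be tried on the father/son output above. The Python sketch below is not Minitab code. It takes the printed sample sizes and standard deviations as inputs, applies the non-pooled formula with VAR = s squared over n, and truncates the result as the text describes.

```python
n1, s1 = 20, 1.76   # father: N and STDEV from the output
n2, s2 = 20, 1.61   # son:    N and STDEV from the output

var1 = s1 ** 2 / n1   # VAR1 = s1^2 / n1
var2 = s2 ** 2 / n2   # VAR2 = s2^2 / n2

# Non-pooled degrees of freedom, before truncation
df_exact = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
df_used = int(df_exact)        # truncated to an integer

df_pooled = n1 + n2 - 2        # degrees of freedom when pooled is used

print(f"non-pooled df = {df_exact:.2f}, truncated to {df_used}")
print(f"pooled df = {df_pooled}")
```

The truncated value agrees with the DF printed in the twosample output.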
    
    twot [k% confidence] data in c, groups in c
    	alternative= k	
    	pooled	
    
    does exactly the same test and confidence interval as twosample. The only
    difference is the form of the data. Twot expects the data for both groups
    in the first column, and the group codes that specify which group each
    observation belongs to in the second column.
    Group codes must be integers between -10,000 and 10,000 or the missing data
    code *.
    It is convenient to use twot when the data for the two groups are mixed
    together.
    
    correlation c...c [put in m]  calculates the linear correlation coefficient
    for all pairs of columns and, optionally, stores them in a matrix m. For
    example,
    
    MTB > Retrieve 'pulse.MTW'
    MTB > correlation 'PULSE1' 'PULSE2' 'HEIGHT' 'WEIGHT'
    
             PULSE1   PULSE2   HEIGHT
    PULSE2    0.616
    HEIGHT   -0.212   -0.143
    WEIGHT   -0.202   -0.169    0.785
    
    This output says that the linear correlation coefficient between WEIGHT and
    HEIGHT is 0.785, etc.
    
    covariance c...c [put in m]  calculates the covariance for all pairs of
    columns and, optionally, stores them in a matrix m.
    
    centre c..c put into c...c	standardizes each column into z-scores
    	by subtracting its mean and dividing by its standard deviation,
    	when no subcommands are given.
    	
    	location 	[subtracting k...k] when no k's are given, each
    			column is transformed by subtracting its mean.
    	
    	scale 		[dividing by k..k] when no k's are given, each
    			column is transformed by dividing by its standard
    			deviation.
     
    	minmax		[min=k, max=k]	when no k's are given, all columns are 
    			transformed to have minimum -1 and maximum +1.
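
The transformations described for centre can be illustrated outside Minitab. The Python sketch below rests on two assumptions that the text does not confirm: the standard deviation is the sample standard deviation (n-1 in the denominator), and minmax is a linear rescaling. The function names and the data are made up for illustration.

```python
import statistics


def centre(values):
    """z-scores: subtract the mean, then divide by the standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)   # assumption: n-1 in the denominator
    return [(v - m) / s for v in values]


def minmax(values, lo=-1.0, hi=1.0):
    """Rescale linearly so the minimum maps to lo and the maximum to hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


data = [2.0, 4.0, 6.0, 8.0]   # made-up illustration data
z = centre(data)
scaled = minmax(data)

print("z-scores:", [round(v, 3) for v in z])
print("minmax:  ", [round(v, 3) for v in scaled])
```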
    
    

    4. Regression:

    Simple, Multiple, and Polynomial Regression
    (1) simple regression: with one predictor (X)
    	regress c on 1 predictor c [put standard resd. in c [fits in c]]
    (2) multiple regression: with more than one predictor (X1, X2, ..., Xp)
    	regress c on k predictors c...c [put standard resd. in c [fits in c]]
    (3) polynomial regression: with powers of one predictor (X, X**2, X**3, ...)
    	regress c on k predictors c1...ck [put st. resd. in c [fits in c]]

    where c2=c1**2, c3=c1**3,..., ck=c1**k

    Some subcommands for Simple, Multiple, and Polynomial Regression:
    	noconstant	fits the equation without a constant term
    	coefficients put into c	stores the coefficients b0, b1,...,bk
    			in column c
    	residual put into c	stores the residuals in column c
    	predict for E...E	computes fitted Y's for given values of X
    	mse put into k	stores the mean square error in k
    	hi put into c	stores the leverages in c
    	cookd put into c	stores Cook's distance in c
    	vif	prints the variance inflation factor associated with each
    			predictor
    	pure	prints the results of the usual pure error test for
    			lack of fit
    
    More subcommands can be found from apple menu > Help... > commands by Name >
    Regression. Here is an example of simple regression and a short
    interpretation of its output.
    
    Worksheet size: 123723 cells
    MTB > Retrieve 'pulse.MTW'
    WORKSHEET SAVED  4/ 1/1991
    
    Worksheet retrieved from file: pulse.MTW
    MTB > regress 'pulse1' 1 'height';
    SUBC> residual c11.
    -------------------------output part 1----------------------------------
    The regression equation is
    PULSE1 = 117 - 0.637 HEIGHT    <--- $\hat{Y} = b_0 + b_1 X$, where $\hat{Y}$
                                        is called the fitted value.
    
    Predictor       Coef       Stdev    t-ratio        p
    Constant      116.65       21.33       5.47    0.000
    HEIGHT       -0.6372      0.3099      -2.06    0.043
    
    s = 10.82       R-sq = 4.5%      R-sq(adj) = 3.4%
    -------------------------output part 2---------------------------------
    Analysis of Variance
    
    SOURCE       DF          SS          MS         F        p
    Regression    1       494.7       494.7      4.23    0.043
    Error        90     10533.8       117.0
    Total        91     11028.4
    -------------------------output part 3---------------------------------
    Unusual Observations
    Obs.  HEIGHT    PULSE1       Fit Stdev.Fit  Residual   St.Resid
     29     63.0    100.00     76.51      2.10     23.49      2.21R 
     31     68.0     96.00     73.33      1.15     22.67      2.11R 
     54     68.0     48.00     73.33      1.15    -25.33     -2.35R 
    
    R denotes an obs. with a large st. resid.
    
    MTB > name c11 'resid'
    MTB > print c11
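    resid    <--- resid = $y - \hat{y}$, the observed value minus the fitted value

    R-sq = 4.5% is the proportion of the total variation explained by the
    regression, calculated by
    $R^2 = \frac{SS\ Regression}{SS\ Total}$
    R-sq(adj) = 3.4% is R2 adjusted for degrees of freedom, calculated by
    $R^2(adj) = 1 - \frac{SS\ Error/(n-p)}{SS\ Total/(n-1)}$
    where n is the number of observations and p is the number of coefficients
    in the model, including the constant.

    Exercise: With the pulse.MTW worksheet retrieved, execute the following
    commands, and then compare your output with the output in the example
    above. What is the same? What is different? This is a multiple regression
    example. 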
    
    MTB > regress 'pulse1' 2 'height' 'weight';
    SUBC> residual c11;
    SUBC> mse k1;
    SUBC> coefficient c12.
    MTB > name c11 'resid' c12 'coefft'
    MTB > print k1
    MTB > print c12
    MTB > print c11
    
    As you will have noticed, the Analysis of Variance output includes the
    sequential sums of squares. Here is a short note on them. 
    
    SOURCE       DF      SEQ SS	<--- sequential sums of squares
    HEIGHT        1       494.7 	<--- SS(b1|b0), the addition to SSR due to adding
     				'height' to the constant model
    WEIGHT        1        37.2 	<--- SS(b2|b0, b1), the addition to SSR due to adding
    				'weight' to the model $\hat{y}$ = b0 + b1 'height'
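
The R-sq, R-sq(adj) and F values in the simple regression output can be reproduced from its Analysis of Variance table, and the sequential sums of squares give the regression sum of squares of the two-predictor model. The Python sketch below is not Minitab code. It assumes the standard definitions R2 = SS Regression / SS Total and adjusted R2 = 1 - [SS Error/(n-p)] / [SS Total/(n-1)], where p counts the coefficients including the constant, and it uses only numbers printed above.

```python
n = 92                      # number of observations (Total DF + 1)
ss_total = 11028.4          # SS Total from the Analysis of Variance table
ss_error = 10533.8          # SS Error for the model with height only
seq_ss = {"height": 494.7, "weight": 37.2}   # sequential sums of squares


def r_squared(ss_reg, ss_tot):
    """Proportion of the total variation explained by the regression."""
    return ss_reg / ss_tot


def adj_r_squared(ss_err, ss_tot, n, p):
    """R-squared adjusted for degrees of freedom (p coefficients)."""
    return 1.0 - (ss_err / (n - p)) / (ss_tot / (n - 1))


# Simple regression: pulse1 on height (p = 2 coefficients)
r2_1 = r_squared(seq_ss["height"], ss_total)
adj_1 = adj_r_squared(ss_error, ss_total, n, p=2)
f_1 = seq_ss["height"] / (ss_error / (n - 2))   # MS Regression / MS Error

# Two predictors: SSR is the sum of the sequential sums of squares
ss_reg_2 = seq_ss["height"] + seq_ss["weight"]
ss_err_2 = ss_total - ss_reg_2
r2_2 = r_squared(ss_reg_2, ss_total)
adj_2 = adj_r_squared(ss_err_2, ss_total, n, p=3)

print(f"height only:     R-sq = {100 * r2_1:.1f}%  R-sq(adj) = {100 * adj_1:.1f}%  F = {f_1:.2f}")
print(f"height + weight: R-sq = {100 * r2_2:.1f}%  R-sq(adj) = {100 * adj_2:.1f}%")
```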
    
    Best Subset Regression:   breg  c on predictors c...c
    does best subset regression using the maximum R2 criterion. Suppose you
    specify m predictors on the command breg. Breg first looks at all
    one-predictor models and chooses the model with the largest R2. Four
    statistics (R2, adj R2, C-p and s) on this and the next best model are
    printed. Then Breg looks at all two-predictor models and selects the model
    with the largest R2, and prints information on this and the next best
    model. This process stops when all m predictors are used.
    
    When comparing models with the same number of predictors, choosing the
    model with the highest R2 is equivalent to choosing the model with the
    smallest SSE. When comparing models with different numbers of predictors,
    choosing the model with the highest adj R2 is equivalent to choosing the
    model with the smallest MSE. In general, we look for a model whose C-p is
    small and close to p, the number of parameters in the model. Here is an
    example of Breg:
    
    MTB > breg 'pulse1' 'height' 'weight' 'activity'
    Best Subsets Regression of PULSE1
                                              A 
                                              C 
                                          H W T 
                                          E E I 
                                          I I V 
                                          G G I 
                  Adj.                    H H T 
    Vars   R-sq   R-sq    C-p         s   T T Y 
    
       1    4.5    3.4    0.6    10.819   X       <--- best one-predictor model
       1    4.1    3.0    0.9    10.841     X     <--- the next best one-predictor model
       2    4.8    2.7    2.3    10.860   X X     <--- best two-predictor model
       2    4.7    2.6    2.4    10.867   X   X   <--- the next best two-predictor model
       3    5.1    1.9    4.0    10.906   X X X   <--- model with all three predictors
    
    Each line of the output represents a different model. Vars is the number of
    variables (predictors) in the model, R-sq is R2. A predictor in the model
    is indicated by an "X".
    
    Subcommands of breg:
    include c...c in all models	the specified columns are included as 
    			predictors in all the models
    best k models	prints the 'best' k models of each size;
    			the default is 2
    nvars k [k] 	for example, nvars 2 4 tells the computer to print only
    			the best 2-, 3- and 4-predictor models
    noconstant	the constant term is omitted from the model
    
    
    

    5. Analysis of Variance

    aovoneway on c...c
     
    does a one-way analysis of variance. Data for each group (level) are to be
    put into a separate column. 
    
    oneway aov, data in c, levels in c [put resids in c[fits in c]]
    
    	Tukey	[family error rate k]
    	Fisher	[individual error rate k]
    	Dunnett	[family error rate k] control level is k
    	MCB	[family error rate k] best is k
    
    is similar to aovoneway, but all data are in the first column and the
    corresponding levels are in the second column, and multiple
    comparisons can be made with subcommands. Tukey and Fisher provide
    confidence intervals for all pairwise differences between level means.
    Dunnett provides a confidence interval for the difference between each
    treatment mean and a control mean. MCB provides a confidence interval for
    the difference between each level mean and the best of the other level
    means. In MCB, there are two choices for best: set k = -1 when the smallest
    mean is considered the best; set k = 1 when the largest mean is considered
    the best.
    
    An example of oneway and aovoneway:
    
    MTB > Retrieve 'poplar2.MTW'
    WORKSHEET SAVED  3/13/1991
    Worksheet retrieved from file: poplar2.MTW < QuickStart < Minitab 8.2
    Note: -99.00 was used to represent a missing value in this data set. Change
    -99.00 in c4 to *, which stands for a missing value in Minitab, before
    proceeding with the following commands. 
    
    MTB > oneway c4 c3;
    SUBC> tukey 0.05.
    
    ANALYSIS OF VARIANCE ON Diameter
    SOURCE     DF        SS        MS        F        p
    Treatmnt    3     54.76     18.25     6.67    0.000
    ERROR     291    796.81      2.74
    TOTAL     294    851.57
                                       INDIVIDUAL 95 PCT CI'S FOR MEAN
                                       BASED ON POOLED STDEV
     LEVEL      N      MEAN     STDEV  --+---------+---------+---------+----
         1     74     4.652     1.682     (------*-----) 
         2     75     4.918     1.749          (-----*-----) 
         3     74     4.471     1.591  (------*-----) 
         4     72     5.613     1.588                     (------*-----) 
                                       --+---------+---------+---------+----
    POOLED STDEV =    1.655            4.20      4.80      5.40      6.00
    
    Tukey's pairwise comparisons
        Family error rate = 0.0500
    Individual error rate = 0.0107
    Critical value = 3.63
    Intervals for (column level mean) - (row level mean)
    
                   1         2         3
         2    -0.962
               0.430
    
         3    -0.517    -0.248
               0.880     1.143
    
         4    -1.664    -1.395    -1.845
              -0.258     0.006    -0.439
    
    To apply the command aovoneway to the same data set, first use the command
    unstack with the subcommand subscripts to separate the data for the
    different levels into different columns.
    
    MTB > unstack c4 into c11 c12 c13 c14;
    SUBC> subscripts c3.
    MTB > aovoneway c11-c14
    ANALYSIS OF VARIANCE
    SOURCE     DF        SS        MS        F        p
    FACTOR      3     54.76     18.25     6.67    0.000
    ERROR     291    796.81      2.74
    TOTAL     294    851.57
                                       INDIVIDUAL 95 PCT CI'S FOR MEAN
                                       BASED ON POOLED STDEV
     LEVEL      N      MEAN     STDEV  --+---------+---------+---------+----
    C11        74     4.652     1.682     (------*-----) 
    C12        75     4.918     1.749          (-----*-----) 
    C13        74     4.471     1.591  (------*-----) 
    C14        72     5.613     1.588                     (------*-----) 
                                       --+---------+---------+---------+----
    POOLED STDEV =    1.655            4.20      4.80      5.40      6.00
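
The entries of the one-way table and one of the Tukey intervals can be recomputed from the printed output. The Python sketch below is not Minitab code. It assumes the usual definitions: MS = SS/DF, F = MS Treatment / MS Error, pooled standard deviation = square root of MS Error, and a Tukey half-width of (critical value / sqrt 2) x pooled stdev x sqrt(1/n1 + 1/n2). The critical value 3.63 is taken as printed, so the endpoints are only accurate to about the third decimal.

```python
import math

ss_treat, df_treat = 54.76, 3      # Treatmnt row of the ANOVA table
ss_error, df_error = 796.81, 291   # ERROR row of the ANOVA table

ms_treat = ss_treat / df_treat     # mean square for treatments
ms_error = ss_error / df_error     # mean square error
f_stat = ms_treat / ms_error       # F statistic
pooled_sd = math.sqrt(ms_error)    # POOLED STDEV

# Tukey interval for (level 1 mean) - (level 2 mean)
q = 3.63                           # critical value printed by Minitab
mean1, n1 = 4.652, 74
mean2, n2 = 4.918, 75
half_width = q / math.sqrt(2) * pooled_sd * math.sqrt(1 / n1 + 1 / n2)
diff = mean1 - mean2
tukey = (diff - half_width, diff + half_width)

print(f"MS = {ms_treat:.2f}, {ms_error:.2f}   F = {f_stat:.2f}   pooled stdev = {pooled_sd:.3f}")
print(f"Tukey interval, level 1 - level 2: ({tukey[0]:.3f}, {tukey[1]:.3f})")
```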
    
    twoway aov, data in c, levels in c c [put resids in c[fits in c]]
    	additive model	<--- to fit a model without the interaction term
    	mean for factors c [and c]	<--- prints marginal means and 95% C.I.,
    				separately for each factor specified.
    
    does a two-way analysis of variance for balanced data (equal number of
    obs. in each cell).
    
    An example of twoway
    MTB > Retrieve 'anom.MTW'
    MTB > twoway c1 c2 c3;
    SUBC> mean c2 c3.
    
    ANALYSIS OF VARIANCE  C1      
    SOURCE        DF        SS        MS
    C2             2     22.72     11.36
    C3             2    198.22     99.11
    INTERACTION    4      3.28      0.82
    ERROR         27     71.00      2.63
    TOTAL         35    295.22
    
                           Individual 95% CI
          C2        Mean   ------+---------+---------+---------+-----
           1        5.42    (--------*---------)
           2        6.08          (---------*--------)
           3        7.33                       (--------*---------)
                           ------+---------+---------+---------+-----
                              5.00      6.00      7.00      8.00
    
                           Individual 95% CI
          C3        Mean   ---------+---------+---------+---------+--
           1        3.17   (----*----)
           2        6.83                     (----*----)
           3        8.83                               (----*----)
                           ---------+---------+---------+---------+--
                                 4.00      6.00      8.00     10.00
    

    6. Tables

    
    tally the data in c...c	prints a one-way table for each column. The default
    output contains frequency counts. The columns must contain integers
    from -9999 to 9999.
    	percents	output contains percents
    	cumcounts	output contains cumulative counts
    	cumpercents	output contains cumulative percents
    	all	output contains all four statistics
    
    For example,
    MTB > Retrieve 'pulse.MTW'
    MTB > tally 'sex' 'smokes'
    
         SEX  COUNT     SMOKES  COUNT
           1    57           1    28 
           2    35           2    64 
          N=    92          N=    92 
    
    MTB > tally 'sex';
    SUBC> percent.
    
         SEX  PERCENT
           1    61.96
           2    38.04
    
    chisquare test on table stored in c...c
    does a chi-square test for association on a contingency table that has been
    stored in the columns c...c. In the case of raw data, you need to form the
    contingency table first, using the command table with its subcommand
    chisquare. 
    
    table the data classified by c...c	displays one-way, two-way, and
    multi-way tables. Some subcommands for table are as follows:
    
    counts		includes the count of observations in each cell
    rowpercents	includes a row percent in each cell
    colpercents	includes a column percent in each cell
    totpercents 	includes the total percent in each cell
    chisquare [k]	does a chi-square test of independence between the rows and
    columns of each two-way table printed. With k=1, the default, only the
    count is put in each cell; k=2 puts the count and the expected value, under
    the assumption of independence, in each cell; k=3 puts the count, the
    expected value, and the standardized residual in each cell.
    
    An example of table:
    MTB > Retrieve 'pulse.MTW'
    MTB > table 'sex' 'activity';
    SUBC> chisquare 2.
       
     ROWS: SEX     COLUMNS: ACTIVITY
     
               0        1        2        3      ALL
      
      1        1        5       35       16       57
            0.62     5.58    37.79    13.01    57.00
      
      2        0        4       26        5       35
            0.38     3.42    23.21     7.99    35.00
      
     ALL       1        9       61       21       92
            1.00     9.00    61.00    21.00    92.00
     
    CHI-SQUARE =     3.118   WITH D.F. =    3
     
      CELL CONTENTS --
                      COUNT
                      EXP FREQ
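
The expected frequencies and the chi-square statistic in the sex by activity table above can be recomputed from the observed counts. The Python sketch below is not Minitab code. It assumes the usual formulas: expected count = row total x column total / grand total, and chi-square = sum of (observed - expected) squared over expected, with (rows - 1)(columns - 1) degrees of freedom.

```python
# Observed counts from the table: rows are SEX, columns are ACTIVITY 0..3
observed = [
    [1, 5, 35, 16],   # sex = 1
    [0, 4, 26, 5],    # sex = 2
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected counts under independence of rows and columns
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

for exp_row in expected:
    print("  ".join(f"{e:6.2f}" for e in exp_row))
print(f"CHI-SQUARE = {chi_square:.3f}  WITH D.F. = {df}")
```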
    
    MTB > table 'smokes' 'sex';
    SUBC> rowpercents.
      
     ROWS: SMOKES     COLUMNS: SEX
     
               1        2      ALL
      
      1    71.43    28.57   100.00
      2    57.81    42.19   100.00
     ALL   61.96    38.04   100.00
     
      CELL CONTENTS --
                      % OF ROW
    
    

    7. Random Data and Distribution

    random k observations into each of c...c
    	bernoulli	p = k
    	binomial	n = k p = k
    	poisson 	mu=k
    	integer	a=k  b= k
    	discrete	values in c, probabilities in c
    	normal	[mu=k [sigma=k]]
    	uniform	[a=k  b=k]
    	t	df=k
    	f	df1=k   df2=k
    	chisquare	df=k
    	
    Generates a separate random sample of k observations in each column, from
    the distribution specified by a subcommand. If no subcommand is given, data
    are generated from the standard normal distribution. 
    
    Example: Generate 20 samples of size 50 from a normal population with mean
    = 2, standard deviation =1. 
    
    Solution: 
    We can use commands 
    
    MTB > random 50 c1-c20;
    SUBC > normal 2 1.
    to make each column a random sample of size 50, or use
    
    MTB > random 20 c1-c50;
    SUBC >normal 2 1.
    
    to make each row a random sample of size 50. The latter one is useful when
    calculation of sample averages for all the random samples is required.
    Command RMEAN c1-c50 c51 puts the 20 sample means into column c51.
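
The row-wise layout can be pictured outside Minitab as follows. This Python sketch is only an analogy to random 20 c1-c50 followed by RMEAN: it builds 20 rows of 50 normal values and takes the mean of each row. The seed is fixed only so that the sketch is repeatable, and the generated numbers will not match anything Minitab produces.

```python
import random
import statistics

random.seed(1)   # fixed seed, only so the sketch is repeatable

n_samples, sample_size = 20, 50   # 20 rows, 50 values in each row
mu, sigma = 2.0, 1.0              # mean and standard deviation of the population

# Each row is one random sample of size 50 from N(2, 1)
samples = [
    [random.gauss(mu, sigma) for _ in range(sample_size)]
    for _ in range(n_samples)
]

# One mean per row, the analogue of RMEAN c1-c50 c51
row_means = [statistics.mean(row) for row in samples]

print(len(row_means), "sample means, the first is", round(row_means[0], 3))
```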
    
    sample k rows from c...c, put into c...c
    	replace
    
    Takes a random sample of k rows from the listed columns and puts it into
    the second list of columns. When the subcommand replace is used, sampling
    is done with replacement; otherwise, it is done without replacement. For
    example, to randomly select 6 numbers without replacement from the integers
    1 to 49, we can proceed as follows.
    
    MTB > set c1
    DATA> 1:49
    DATA> end
    MTB > sample 6 c1 c2
    MTB > print c2
    C2  
       43    25     3    31    11    17 
    
    Quick pick (Lotto 6/49), anyone?
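
For comparison, the same draw without replacement can be sketched in Python with the standard library. This is an analogy to the sample command, not Minitab code, and the numbers drawn will differ from run to run.

```python
import random

numbers = range(1, 50)             # the integers 1 to 49
pick = random.sample(numbers, 6)   # 6 distinct values, without replacement

print(sorted(pick))
```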
    
    pdf for values in E[put into E]
    
    	bernoulli	p = k
    	binomial	n = k p = k
    	poisson 	mu=k
    	integer	a=k  b= k
    	discrete	values in c, probabilities in c
    	normal	[mu=k [sigma=k]]
    	uniform	[a=k  b=k]
    	t	df=k
    	F	df1=k   df2=k
    	chisquare	df=k
    
    For a discrete distribution, PDF calculates probabilities for the specified
    values in E. For a continuous distribution, PDF calculates the probability
    density function. For example,
    MTB > set c3
    DATA> 1 4 5
    DATA> end
    MTB > pdf c3;
    SUBC> binomial 5 0.5.
           K          P( X = K)
         1.00            0.1562
         4.00            0.1562
         5.00            0.0312
    
    (In the remaining examples, c3 contains the values 1 3 5.)
    MTB > pdf c3;
    SUBC> normal 3 2.
        1.0000    0.1210
        3.0000    0.1995
        5.0000    0.1210
    
    CDF for values in E ...E[put into E...E]
    
    	bernoulli	p = k
    	binomial	n = k p = k
    	poisson 	mu=k
    	integer	a=k  b= k
    	discrete	values in c, probabilities in c
    	normal	[mu=k [sigma=k]]
    	uniform	[a=k  b=k]
    	t	df=k
    	f	df1=k   df2=k
    	chisquare	df=k
    
    CDF, the cumulative distribution function, calculates the probability that
    a random variable X has a value less than or equal to x. That is, CDF(x) =
    Pr(X ≤ x). For example,
    MTB > cdf c3;
    SUBC> binomial 5 0.5.
           K       P(X LESS OR = K)
         1.00            0.1875
         3.00            0.8125
         5.00            1.0000
    MTB > cdf c3;
    SUBC> normal 3 2.
        1.0000    0.1587
        3.0000    0.5000
        5.0000    0.8413
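
The PDF and CDF values printed above can be checked with a few lines of Python. This sketch is not Minitab code. It uses the textbook formulas for the binomial(5, 0.5) distribution and for the normal distribution with mean 3 and standard deviation 2. Note that the binomial PDF output was computed at the values 1, 4, 5, while the other three outputs show the values 1, 3, 5.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) random variable."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k) for a binomial(n, p) random variable."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


def normal_pdf(x, mu, sigma):
    """Density of the normal(mu, sigma) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal(mu, sigma) random variable."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


for k in (1, 4, 5):
    print(f"binomial pdf at {k}: {binomial_pmf(k, 5, 0.5):.5f}")
for x in (1, 3, 5):
    print(
        f"x = {x}: normal pdf = {normal_pdf(x, 3, 2):.4f}, "
        f"binomial cdf = {binomial_cdf(x, 5, 0.5):.4f}, "
        f"normal cdf = {normal_cdf(x, 3, 2):.4f}"
    )
```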
    
    INVCDF for values in E [put into E]
    
    	bernoulli	p = k
    	binomial	n = k p = k
    	poisson 	mu=k
    	integer	a=k  b= k
    	discrete	values in c, probabilities in c
    	normal	[mu=k [sigma=k]]
    	uniform	[a=k  b=k]
    	t	df=k
    	f	df1=k   df2=k
    	chisquare	df=k
    
    INVCDF finds the value x corresponding to a given probability p for the
    specified distribution, that is, the x with CDF(x) = p. For example,
    MTB > invcdf 0.05;
    SUBC> t 15.
        0.0500   -1.7531
    
    Command INVCDF is particularly useful for finding the critical value for a
    given significance level alpha when doing tests of hypotheses or computing
    confidence intervals.

    Acknowledgment

    I would like to thank Myra Andrews for her most helpful assistance in
    editing this manual.