Preface

Since 2008, I have given several seminars in foreign countries and provided lectures for international students and trainees in my country. In most cases, I used PowerPoint slides to illustrate theories and methods and then employed Excel sheets and example R code for hands-on exercises. I still use those materials, but I have also been developing this handout for the convenience of the audience as well as myself.

The first part of this manuscript was originally prepared for a seminar given at the Inland Fishery Resources Development And Management Department (IFRDMD) in Indonesia on December 26-28, 2018. The manuscript has now been extended to cover several additional topics for a seminar given at the National Aquatic Resources Research & Development Agency (NARA) on March 13-15, 2019. The contents are still being updated and modified, to improve the text and to correct some code based on feedback from participants of the seminars and lectures. In addition, I may develop it into a printed textbook. Therefore, please DO NOT pass this document to anyone else without the author’s permission. I hope this is understandable.

Acknowledgement

I would like to express my sincere appreciation to NARA for the invitation, which gave me a nice opportunity to hold this seminar and visit Colombo. I would also like to thank my friend Sujeewa Haputhantri for his excellent coordination, and Steve Creech for his kind arrangements.


Information on the author

Dr. Toshihide Kitakado is a professor at Tokyo University of Marine Science and Technology (Laboratory of Fish Population Analysis). He graduated from a department of mathematics education and then studied probability and theoretical statistics more intensively in graduate school. He currently works on stock assessment and management for several species, such as tunas, saury, sandlance, horse mackerel, whales, seals and sea lions.

He has been, and is currently, involved in many international fishery organizations:

  • Chair of the Working Group on the In-depth Assessment of western North Pacific minke whales in the International Whaling Commission (IWC) [2006-2009]
  • Chair of the Sub-committee on bowhead, right and gray whales in the IWC [2010-2012]
  • Chair of the Scientific Committee of the IWC [2012-2015]
  • Chair of the Working Group on Ecosystem Modelling in the IWC [2016-present]
  • Vice-Chair of the Working Party on Temperate Tunas in the IOTC [2016-present]
  • Chair of the Working Party on Methods in the IOTC [2016-present]
  • Chair of the Technical Working Group on Pacific saury stock assessment in the NPFC [2017-present]

He has also been contributing to other international and domestic activities:

  • Lecturer for JICA (Japan International Cooperation Agency) [2010-2017]
  • Invited lecturer for the Korean Fishery Institute [2017-present]
  • Seminars in the US, Canada, Spain, Norway, Korea, China, Malaysia, Indonesia and Thailand
  • Setting annual quotas for harbor seals, Steller sea lions and Antarctic minke whales
  • External reviewer for domestic fishery stock assessments [2012-present]
  • Member of the Scientific Committee for the conservation and management of harbor seals [2014-present]
  • Permanent Editor of “Cetacean Population Studies” [2018-present]
  • Member of the Editorial Board of “Journal of Fisheries Technology” [2018-present]
  • Member of the Editorial Board of “Fisheries Science” [2019]

Currently, he is supervising 19 students (2 PhD, 11 masters and 6 undergraduates) and will keep a group of the same size in the new fiscal year 2019 (3 PhD, 11 masters and 5 undergraduates).

I am not very sure whether I can meet all of your requests, but if you have any matters for consultation, I would be happy to hear them!

[Figure: Workshop for consultation]


Preparation of R, Rstudio and Rmarkdown

Before starting the seminar, please visit the following websites and download the applications that we will use in the seminar.

R

First, please visit the website below and download, for example, “R 3.5.xx for Windows”:

https://cran.r-project.org/


Rstudio

Then, please visit the website below and download the free version of “RStudio Desktop”:

https://www.rstudio.com/products/rstudio/download/


Rmarkdown

I will explain how you can create a memo/report of your analysis with R in an easy way using Rmarkdown. I am writing this handout with it.
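As a minimal sketch (the title and the chunk contents are just placeholders of mine), an Rmarkdown file is a plain text file like the one below; clicking “Knit” in RStudio renders it into an HTML report:

---
title: "My analysis memo"
output: html_document
---

## Data summary

```{r}
a <- c(1, 2, 3, 4, 5)
mean(a)
```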

Required libraries

Please install the following packages in R before the seminar, using the command

install.packages("package name")

install.packages("knitr")
install.packages("rmarkdown")
install.packages("kableExtra")

install.packages("ggplot2")
install.packages("gridExtra")

install.packages("mgcv")
install.packages("fields")
install.packages("ggmap")
install.packages("marmap")
install.packages("mapdata")

install.packages("shiny")

Then you can call those libraries.

library(knitr)
library(rmarkdown)
library(kableExtra)

library(ggplot2)
library(gridExtra)

library(mgcv)
library(fields)
library(ggmap)
library(marmap)
library(mapdata)

library(shiny)

#May not be used (for TK-memo)
#library(boot) # for regression diagnosis
#library(MASS)
#library(MuMIn)
#library(dplyr)

Lecture 1. Overview of stock assessment and management

I would like to begin with a very brief introduction to stock assessment and management in a general sense. The messages here are as follows:

  1. Studies on fishery stock assessment and management are, in a sense, a “science of uncertainty” and a “science under uncertainty”. We have to face many types of uncertainty (see the examples below). Therefore, we should think about how to assess the extent of each uncertainty and how to deal with it.
  • data uncertainty (data include observation error, sometimes with non-ignorable bias…)
  • design uncertainty (even a nice survey design is vulnerable to bad conditions)
  • parameter uncertainty (many types: natural mortality, fecundity, steepness, …)
  • estimation uncertainty (the information in the data is not perfect)
  • model uncertainty (a human-made model may not be perfect)
  • process uncertainty (a human-made model cannot capture random variation even if its deterministic part is correct)
  • implementation uncertainty (a TAC set by scientists may not be followed by fisheries)
  • human errors (coding mistakes, data handling errors, reporting errors, …)
  • anything else?
  2. You may find a lack of information when you try to conduct a stock assessment and develop a management plan. However, no one has perfect information. It is not too late to start collecting data and information on biology/ecology, abundance, and other key parameters. If you do not think about this seriously now, you will have the same problem 5 or 10 years later. Let’s set short-term targets as well as medium- or longer-term objectives.

  3. Statistical work such as visual presentation of data and maps, and summaries as simple tables, is of course useful. However, more elaborate analyses such as modelling and model exploration, statistical inference (estimation, hypothesis testing and forecasting), and risk assessment are extremely useful and indeed mandatory for current and future stock assessment and management.

  4. Other points are covered in the slides…


Lecture 2. How to use R via Rstudio

Getting an R session started with basic arithmetic and vector handling

 a <- c(1,2,3,4,5)
 a
[1] 1 2 3 4 5
 sum(a)
[1] 15
 mean(a)
[1] 3
 a+10
[1] 11 12 13 14 15
 5*a
[1]  5 10 15 20 25
 a^2
[1]  1  4  9 16 25
 log(a)
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379
 a[3]
[1] 3
 a[c(1,2,3)]
[1] 1 2 3
 a[-c(2,4)]
[1] 1 3 5
 b <- seq(10,50,10)
 a+b 
[1] 11 22 33 44 55
 a*b
[1]  10  40  90 160 250
 b/a
[1] 10 10 10 10 10

For loop

ss <- 0
 for(i in 1:10){
  ss <- ss + i  
}
ss
[1] 55
# Just in case, this is done simply as 
a<- seq(1,10)
a
 [1]  1  2  3  4  5  6  7  8  9 10
sum(a)
[1] 55

Monte Carlo simulation (a gambling example; perhaps not the most appropriate one?)

  • You have some capital (e.g., $20)
  • You win or lose each bet with probability 0.5
  • You bet a fixed amount (e.g., $1)
  • You go into “ruin” if you lose all of your money

Start with capital of $20 and continue betting $1 for 100 trials.

p <- 0.5
Bet <- 1 
Capital <- 20
Maxit <- 100 + 1

X <- numeric(Maxit)
X[1] <- Capital

for(j in 2:Maxit)
{
 if(X[j-1]>0) X[j] <- X[j-1] + sample(Bet*c(-1,1),1,prob=c(1-p,p)) 
}

plot(X, type="l", ylim=c(0, max(X)*1.2), xlab="Trial", ylab="State")
abline(Capital, 0, lty=2, col="red")

We now define a function for the computation above and then run it under different settings.

Gambling1 <- function(p, Bet, Capital, Maxit=101, Fig=F)
{
  X <- numeric(Maxit)
  X[1] <- Capital

  for(j in 2:Maxit)
  {
   if(X[j-1]>0) X[j] <- X[j-1] + sample(Bet*c(-1,1),1,prob=c(1-p,p)) 
  }
  
  if(Fig==T)
  {  
   plot(X, type="l", ylim=c(0, max(X)*1.2), xlab="Trial", ylab="State")
   abline(Capital, 0, lty=2, col="red")
  }
}

Let us run function “Gambling1”.

par(mfrow=c(2,2), mar=c(3,3,3,3))
Gambling1(p=0.5, Bet=1, Capital=20, Fig=T)
Gambling1(p=0.5, Bet=1, Capital=20, Fig=T)
Gambling1(p=0.5, Bet=1, Capital=10, Fig=T)
Gambling1(p=0.5, Bet=1, Capital=10, Fig=T)

We now extend the function above to deal with multiple simulations.

Gambling2 <- function(p, Bet, Capital, Maxit=101, Nsim=100)
{
  X <- array(0, c(Nsim, Maxit))

  for(i in 1:Nsim)
  {
    X[i,1] <- Capital
    for(j in 2:Maxit)
    {
     Tmp <- X[i,(j-1)] + sample(Bet*c(-1,1),1,prob=c(1-p,p))
     if(Tmp >0) X[i,j] <- Tmp 
     if(Tmp<=0) X[i,j] <- -10     
    }
  }
  plot(X[1,], type="l", ylim=c(0, max(X)*1.2), 
       xlab="Trial", ylab="State", col="grey")
  for(i in 1:Nsim) points(X[i,], type="l", col="grey")
  abline(Capital, 0, lty=2, col="red")
  hist(X[,Maxit], col="orange", xlab="Final state", main="")
  return(X)
}

Then let us run the function “Gambling2” again.

par(mfrow=c(2,2), mar=c(3,3,3,3))
Res <- Gambling2(p=0.5, Bet=1, Capital=20, Maxit=100, Nsim=100)
Res <- Gambling2(p=0.5, Bet=1, Capital=20, Maxit=1000, Nsim=100)
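As a small follow-up (not in the original code), we can estimate the probability of ruin from the simulation output; recall that “Gambling2” records ruined trajectories as -10.

# Proportion of trajectories ruined by the final trial
mean(Res[, ncol(Res)] == -10)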

Simple analysis for body proportion of horse mackerel

We will conduct a very simple analysis of body proportions for some Japanese fish species, with statistical testing and visual presentation. We first read a data file named “Horsemackerel.csv”.

# Read the data file
Data <- read.csv("Data/Horsemackerel.csv", header=T)
head(Data)
  Species  TL  FL  SL HL BD   ED
1   Maaji 297 259 235 65 59 15.4
2   Maaji 285 263 255 65 61 15.2
3   Maaji 285 255 263 65 63 15.1
4   Maaji 319 275 269 71 63 16.6
5   Maaji 297 259 235 65 59 15.1
6   Maaji 194 172 160 45 42 13.6
# Extract the data column and create a new object "proportion"
Data$BD
 [1] 59 61 63 63 59 42 42 50 46 44 40 53 34 38 38 36 36 36 34 38 37 37 40
[24] 36 34 38 37 37 40 38 31 30 36 31 39 37 36 41 31 30 36 31 37 41 45 37
[47] 38 31 30 36 31 37 41 45 37 38 31 30 36 31 37 41 45 37 38 39 31 30 36
[70] 31 39 37 36 41 31 30 36 31 37 41 45 37 38 39 39 37 37 36 36 35
 [ reached getOption("max.print") -- omitted 85 entries ]
Data$SL
 [1] 235 255 263 269 235 160 174 182 171 168 152 203 137 149 150 145 146
[18] 145 136 148 137 145 155 145 136 148 137 145 155 145 118 122 138 133
[35] 142 129 134 152 137 143 157 150 148 150 175 166 161 137 143 157 150
[52] 148 150 175 166 161 137 143 157 150 148 150 175 166 161 141 118 122
[69] 138 133 142 129 134 152 137 143 157 150 148 150 175 166 161 205 213
[86] 215 221 189 193 197
 [ reached getOption("max.print") -- omitted 85 entries ]
Proportion <- Data$BD/Data$SL

# Add "Prop" into the current data set "Data"
Data <- cbind(Data, Prop=Proportion)
attach(Data)

Visual presentation

Quick presentation

# Extracting data
SP <- as.numeric(Species)
SPname <- levels(Species)
SP
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[36] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[71] 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4
 [ reached getOption("max.print") -- omitted 85 entries ]
SPname
[1] "Kaiwari" "Maaji"   "Maruaji" "Moro"   
# Scatter plot and boxplot
par(mfrow=c(1,2)) #Splitting screen
plot(BD~SL, pch=SP, col=SP, xlab="Standard Length (SL)", ylab="Body Depth (BD)")
legend(220, 85, legend=levels(Species), pch=1:4, col=1:4, bty="n") 
boxplot(Prop~Species, ylab="Proportion")

# Histogram of proportion by species
par(mfcol=c(2,2))
for(i in 1:length(SPname)) hist(Prop[SP==i], xlab="Prop", main=paste(SPname[i]), col="gray")

More colourful visual presentation by “ggplot”

The library “ggplot2” provides beautiful figures, but it requires a bit of training in its syntax, so I shall use “qplot”, a quicker interface to ggplot.

plot1 <- qplot(SL, BD, col=Species)
plot2 <- qplot(Species, Prop, geom="boxplot")
plot3 <- qplot(Species, Prop, geom="boxplot", col=Species)
plot4 <- qplot(Prop, facets=Species~., fill=Species, data=Data)
grid.arrange(plot1, plot2, plot3, plot4, nrow=2, ncol=2)
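For reference, here is a sketch of the same four panels written in the full “ggplot” syntax (it should produce essentially the same figures as the “qplot” calls above):

plot1 <- ggplot(Data, aes(x=SL, y=BD, col=Species)) + geom_point()
plot2 <- ggplot(Data, aes(x=Species, y=Prop)) + geom_boxplot()
plot3 <- ggplot(Data, aes(x=Species, y=Prop, col=Species)) + geom_boxplot()
plot4 <- ggplot(Data, aes(x=Prop, fill=Species)) + geom_histogram() + facet_grid(Species~.)
grid.arrange(plot1, plot2, plot3, plot4, nrow=2, ncol=2)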

Analysis

Summary statistics

y <- split(Prop, Species)
y$Kaiwari
 [1] 0.4893617 0.5074627 0.5056818 0.5080645 0.4916667 0.4556213 0.4528302
 [8] 0.4962406 0.4714286 0.4771242 0.4838710 0.4887640 0.4842767 0.4838710
[15] 0.4842767 0.4887640 0.4714286 0.4771242 0.4838710 0.4864865 0.4831461
[22] 0.4410256 0.4462810 0.4410256 0.4462810 0.4410256 0.4462810 0.4831461
[29] 0.4864865 0.4410256 0.4462810
length(y$Kaiwari)
[1] 31
sapply(y, length)
Kaiwari   Maaji Maruaji    Moro 
     31      83      15      46 
lapply(y, length)
$Kaiwari
[1] 31

$Maaji
[1] 83

$Maruaji
[1] 15

$Moro
[1] 46
Ns <- sapply(y, length)
Mean <- sapply(y, mean)
SD <- sapply(y, sd)
data.frame(Nsample=Ns, Mean, SD)
        Nsample      Mean         SD
Kaiwari      31 0.4738781 0.02159171
Maaji        83 0.2477849 0.02045780
Maruaji      15 0.2273268 0.01283770
Moro         46 0.1908011 0.01381693

Example of a one-sample test (null hypothesis H0: mu = 0.2)

lapply(y, t.test, mu=0.2)
$Kaiwari

    One Sample t-test

data:  X[[i]]
t = 70.624, df = 30, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0.2
95 percent confidence interval:
 0.4659582 0.4817980
sample estimates:
mean of x 
0.4738781 


$Maaji

    One Sample t-test

data:  X[[i]]
t = 21.28, df = 82, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0.2
95 percent confidence interval:
 0.2433179 0.2522520
sample estimates:
mean of x 
0.2477849 


$Maruaji

    One Sample t-test

data:  X[[i]]
t = 8.2442, df = 14, p-value = 9.634e-07
alternative hypothesis: true mean is not equal to 0.2
95 percent confidence interval:
 0.2202175 0.2344360
sample estimates:
mean of x 
0.2273268 


$Moro

    One Sample t-test

data:  X[[i]]
t = -4.5155, df = 45, p-value = 4.527e-05
alternative hypothesis: true mean is not equal to 0.2
95 percent confidence interval:
 0.1866980 0.1949042
sample estimates:
mean of x 
0.1908011 
res.t.test <- lapply(y, t.test, mu=0.2)
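Since “res.t.test” is a list of test results, compact summaries can be extracted from it; for example, the p-values (a small sketch, not in the original handout):

# Extract the p-value of each species' test
sapply(res.t.test, function(z) z$p.value)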

Two-sample test (whether the means of two species are equal or not)

# variance same? 
var.test(y[[2]],y[[3]])

    F test to compare two variances

data:  y[[2]] and y[[3]]
F = 2.5395, num df = 82, denom df = 14, p-value = 0.05378
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.9838916 5.1560229
sample estimates:
ratio of variances 
          2.539471 
# t-test
t.test(y[[2]],y[[3]], var.equal=T)

    Two Sample t-test

data:  y[[2]] and y[[3]]
t = 3.7332, df = 96, p-value = 0.0003205
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.009580294 0.031336083
sample estimates:
mean of x mean of y 
0.2477849 0.2273268 
t.test(y[[2]],y[[3]])

    Welch Two Sample t-test

data:  y[[2]] and y[[3]]
t = 5.1098, df = 28.765, p-value = 1.91e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.01226681 0.02864956
sample estimates:
mean of x mean of y 
0.2477849 0.2273268 

Analysis of variance (testing whether all the means are equal or not)

Under an assumption of “common variance” across species,

res.lm <- lm(Prop~Species)
summary(res.lm)

Call:
lm(formula = Prop ~ Species)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.041118 -0.012597  0.002215  0.011178  0.043030 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.473878   0.003342  141.80   <2e-16 ***
SpeciesMaaji   -0.226093   0.003917  -57.73   <2e-16 ***
SpeciesMaruaji -0.246551   0.005852  -42.13   <2e-16 ***
SpeciesMoro    -0.283077   0.004324  -65.47   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.01861 on 171 degrees of freedom
Multiple R-squared:  0.9653,    Adjusted R-squared:  0.9647 
F-statistic:  1584 on 3 and 171 DF,  p-value: < 2.2e-16
res.lm <- lm(Prop~Species-1)
summary(res.lm)

Call:
lm(formula = Prop ~ Species - 1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.041118 -0.012597  0.002215  0.011178  0.043030 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
SpeciesKaiwari 0.473878   0.003342  141.80   <2e-16 ***
SpeciesMaaji   0.247785   0.002042  121.32   <2e-16 ***
SpeciesMaruaji 0.227327   0.004804   47.32   <2e-16 ***
SpeciesMoro    0.190801   0.002743   69.55   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.01861 on 171 degrees of freedom
Multiple R-squared:  0.9959,    Adjusted R-squared:  0.9958 
F-statistic: 1.048e+04 on 4 and 171 DF,  p-value: < 2.2e-16
res.anova1 <- anova(res.lm)
res.anova1
Analysis of Variance Table

Response: Prop
           Df  Sum Sq Mean Sq F value    Pr(>F)    
Species     4 14.5072  3.6268   10476 < 2.2e-16 ***
Residuals 171  0.0592  0.0003                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In case of “heterogeneity in variance”,

res.bartlett <- bartlett.test(Prop~Species)
res.bartlett

    Bartlett test of homogeneity of variances

data:  Prop by Species
Bartlett's K-squared = 12.515, df = 3, p-value = 0.005813
res.anova2 <- oneway.test(Prop~Species, var.equal = FALSE)
res.anova2

    One-way analysis of means (not assuming equal variances)

data:  Prop and Species
F = 1373.1, num df = 3.00, denom df = 54.97, p-value < 2.2e-16

Pairwise t-test

pairwise.t.test(Prop,Species)

    Pairwise comparisons using t tests with pooled SD 

data:  Prop and Species 

        Kaiwari Maaji   Maruaji
Maaji   < 2e-16 -       -      
Maruaji < 2e-16 0.00013 -      
Moro    < 2e-16 < 2e-16 9.8e-10

P value adjustment method: holm 
pairwise.t.test(Prop,Species, pool.sd = FALSE)

    Pairwise comparisons using t tests with non-pooled SD 

data:  Prop and Species 

        Kaiwari Maaji   Maruaji
Maaji   < 2e-16 -       -      
Maruaji < 2e-16 1.9e-05 -      
Moro    < 2e-16 < 2e-16 1.9e-09

P value adjustment method: holm 
pairwise.t.test(Prop,Species, pool.sd = FALSE, p.adj = "bonf")

    Pairwise comparisons using t tests with non-pooled SD 

data:  Prop and Species 

        Kaiwari Maaji   Maruaji
Maaji   < 2e-16 -       -      
Maruaji < 2e-16 0.00011 -      
Moro    < 2e-16 < 2e-16 5.7e-09

P value adjustment method: bonferroni 

Lecture 3. Several important biological parameters for stock assessment and management

We shall start by estimating biological parameters using several regression methods.

In R, many types of regression analyses can be conducted (see the table below).

[Table: types of regression analyses available in R]

Estimation of allometry formula in R

The common allometric function is defined as follows: \[ W = a L^b. \]

Our example data set consists of length and weight observations for rainbow trout at the Ohizumi experimental station in Japan.

We start by reading the data and drawing a simple scatter plot.

# Reading and plotting the data 
Data <- read.csv("Data/Rainbowtrout.csv", header=T)
names(Data)
[1] "Length" "Weight"
head(Data)
  Length Weight
1    7.5     10
2    9.2      9
3    9.5     25
4    9.7     35
5   10.5     20
6   10.6     35
Length <- Data$Length
Weight <- Data$Weight
plot(Length, Weight, xlab="Length (cm)", ylab="Weight (g)", 
     main="Allometric data for rainbow trouts")

A simple regression analysis for estimating allometric curve

Then we conduct a simple regression analysis for the following simple linear model.

\[ \log W_i = \log a + b \log L_i + \varepsilon_i, \quad \varepsilon_i \sim N(0,\sigma^2). \]

# Conducting regression analysis 
res <- lm(log(Weight)~log(Length))
summary(res)

Call:
lm(formula = log(Weight) ~ log(Length))

Residuals:
     Min       1Q   Median       3Q      Max 
-0.59889 -0.08977 -0.00832  0.14493  0.69146 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -3.04898    0.28442  -10.72 6.67e-13 ***
log(Length)  2.60235    0.08474   30.71  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.286 on 37 degrees of freedom
Multiple R-squared:  0.9622,    Adjusted R-squared:  0.9612 
F-statistic: 943.1 on 1 and 37 DF,  p-value: < 2.2e-16
plot(log(Length), log(Weight))
abline(res)

We can extract the parameter estimates of \(\alpha=\log a\) and \(\beta=b\).

# List of objects
names(res)
 [1] "coefficients"  "residuals"     "effects"       "rank"         
 [5] "fitted.values" "assign"        "qr"            "df.residual"  
 [9] "xlevels"       "call"          "terms"         "model"        
names(summary(res))
 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 
# Extracting parameter estimates (both commands give the same information)
summary(res)$coef
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) -3.048982 0.28442057 -10.71998 6.667534e-13
log(Length)  2.602353 0.08474089  30.70953 6.255463e-28
coef(res) 
(Intercept) log(Length) 
  -3.048982    2.602353 
alpha.est <- as.numeric(coef(res)[1])
beta.est <- as.numeric(coef(res)[2])

# Parameter transformation
a.est <- exp(alpha.est)
b.est <- beta.est
Est <- c(a.est, b.est); Est
[1] 0.04740718 2.60235273

We can also extract confidence intervals of parameter estimates.

# Use a function "confint"
confint(res)
                2.5 %    97.5 %
(Intercept) -3.625272 -2.472691
log(Length)  2.430651  2.774054
confint(res)[1,]
    2.5 %    97.5 % 
-3.625272 -2.472691 
confint(res)[2,]
   2.5 %   97.5 % 
2.430651 2.774054 
# Alternatively.... 
confint(res, "(Intercept)")
                2.5 %    97.5 %
(Intercept) -3.625272 -2.472691
confint(res, "log(Length)")
               2.5 %   97.5 %
log(Length) 2.430651 2.774054
# For the following summary
alpha.ci <- confint(res)[1,]
beta.ci <- confint(res)[2,]
a.ci <- exp(alpha.ci)
b.ci <- beta.ci
CI <- rbind(a.ci, b.ci); CI
          2.5 %     97.5 %
a.ci 0.02664184 0.08435757
b.ci 2.43065137 2.77405408

Then a summary is given as follows:

Out <- cbind(Est, CI)
row.names(Out) <- c("a", "b")
Out
         Est      2.5 %     97.5 %
a 0.04740718 0.02664184 0.08435757
b 2.60235273 2.43065137 2.77405408
kable(Out)
Est 2.5 % 97.5 %
a 0.0474072 0.0266418 0.0843576
b 2.6023527 2.4306514 2.7740541

For PDF output using LaTeX syntax:

# For latex output (may not be effective in html document)
kable(Out, booktabs=T, longtable=F, caption="Summary of parameter estimates.") %>%
 kable_styling(latex_options = c("striped", "hold_position"))
Summary of parameter estimates.
Est 2.5 % 97.5 %
a 0.0474072 0.0266418 0.0843576
b 2.6023527 2.4306514 2.7740541

Diagnostics

In the general formulation of regression analysis, \[ y_i = \alpha + \beta x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0,\sigma^2), \] the standardized residual, defined as \[ e_i = \frac{\text{(observation)}-\text{(fitted)}}{\text{estimate of sd}} = \frac{y_i-(\hat{\alpha} + \hat{\beta} x_i )}{\hat{\sigma}}, \] should follow the standard normal distribution N(0,1). Let’s see whether this is the case or not.

# Diagnostics
names(summary(res))
 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 
sigma.est <- summary(res)$sigma

# Standardized residuals, which should follow N(0,1)
e <- (log(Weight)-res$fitted.values)/sigma.est

# Plot and histogram
plot(res$fitted.values, e, ylim=c(-2.5, 2.5), ylab="Standardized residuals")
abline(0,0)
abline(qnorm(0.025),0,lty=2)
abline(qnorm(0.975),0,lty=2)

hist(e, xlab="Standardized residuals", main="Histogram of standardized residuals")

# Default plots in "lm" in R
par(mfrow=c(2,2), mar=c(4,4,3,2))
plot(res)

Confidence intervals for the curve and for prediction

There are two types of confidence intervals:
- Confidence interval for the fitted regression curve
- Confidence interval for prediction

# Confidence interval of the curve
plot(Length, Weight, xlab="Length (mm)", ylab="Weight (g)", main="With CI for estimates")  
pred <- exp(predict(res, int="c", level = 0.95))
matlines(Length, pred, lty=c(1,2,2), col=1)

# Confidence interval of the prediction
plot(Length, Weight, xlab="Length (mm)", ylab="Weight (g)", main="With CI for prediction") 
pred <- exp(predict(res, int="p", level = 0.95))
matlines(Length, pred, lty=c(1,2,2), col=1)

Lecture 4. Hands-on exercises for estimation of maturity ogive (GLM)

Generalized linear model (GLM)

Why “generalized”?
- Distributional assumption (not necessarily the “normal distribution”)
- Binomial, Poisson, Gamma distributions, etc.
- Regression covariates are connected to the response mean via a “link function”

GLM in R
- “glm” (almost the same syntax as “lm”)
- Estimate, CI, testing, AIC, …

Maturity analysis

Assumption
- You are analyzing “age at sexual maturity (ASM)” of a fish species
- You observe the numbers of mature and immature individuals by age

Structural model
- Maturity is a function of age, \(p(a)\)
- \(p(a)\) should be a monotonically increasing function of \(a\) like a sigmoid function:

  1. Logistic function
    \(p(a)=\displaystyle \frac{e^{\beta_0 + \beta_1 a}}{1+e^{\beta_0 + \beta_1 a}} \ \Longrightarrow \ \log \frac{p(a)}{1-p(a)} = \beta_0 + \beta_1 a\)

  2. Probit function
    \(p(a)=\displaystyle \int_{-\infty}^a f(z) dz\)

Drawing logistic function

# Definition of logistic function (the parameters are beta0 and beta1)
LogisticFn <- function(x,beta0,beta1) {1.0/(1.0+exp(-beta0-beta1*x))}
Age <- seq(0,10,0.01)

# beta0=-5, beta1=1
plot(Age, LogisticFn(Age,-5,1), type="l", ylab="Maturity", lty=1)

# beta0=-2.5, beta1=0.5
points(Age, LogisticFn(Age,-2.5,0.5), type="l", lty=2)

# beta0=-10, beta1=2
points(Age, LogisticFn(Age,-10,2), type="l", lty=3)

# legends
legend(0, 1, lty=1:3, 
 legend=c("beta0=-5, beta1=1","beta0=-2.5, beta1=0.5","beta0=-10, beta1=2"),bty="n")

Statistical model and generation of example data

Distributional assumption
- Two states (mature/immature) -> Binomial distribution
- \(N_a\): number of sampled individuals of age \(a\)
- \(Y_a\): number of mature individuals among \(N_a\) individuals
- \(p(a)\): maturity (probability) in age \(a\)
- \(Y_a \sim Bin(N_a, p(a))\)

set.seed(2018)

Amin <- 1
Amax <- 10
AgeRange <- Age <- Amin:Amax
NAgeClass<- length(AgeRange)

# True beta0=-5, beta1=1
Nvec <- rep(10, NAgeClass)
pvec <- LogisticFn(AgeRange, -5, 1)
pvec
 [1] 0.01798621 0.04742587 0.11920292 0.26894142 0.50000000 0.73105858
 [7] 0.88079708 0.95257413 0.98201379 0.99330715
# Generation of simulation data
Yvec <- rbinom(NAgeClass, Nvec, pvec)
Yvec
 [1]  0  0  1  1  4  6  8  9 10 10
Maturity.obs <- Yvec/Nvec
Maturity.obs
 [1] 0.0 0.0 0.1 0.1 0.4 0.6 0.8 0.9 1.0 1.0
# Plotting the data
plot(Age, LogisticFn(Age,-5,1), type="l", ylab="Maturity")
points(Age, Maturity.obs, pch="*")
legend(6, 0.4, lty=1:0, pch=c("","*"), legend=c("True","Observed"), bty="n")

Statistical estimation of the maturity curve using the “glm” function

Res.glm.logit <- glm(cbind(Yvec,Nvec-Yvec)~Age, family=binomial)
Res.glm.logit

Call:  glm(formula = cbind(Yvec, Nvec - Yvec) ~ Age, family = binomial)

Coefficients:
(Intercept)          Age  
     -6.196        1.106  

Degrees of Freedom: 9 Total (i.e. Null);  8 Residual
Null Deviance:      82.16 
Residual Deviance: 2.005    AIC: 19.62
# Summary
summary(Res.glm.logit)

Call:
glm(formula = cbind(Yvec, Nvec - Yvec) ~ Age, family = binomial)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.6076  -0.3936  -0.1285   0.3960   0.6784  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -6.1955     1.2441  -4.980 6.36e-07 ***
Age           1.1062     0.2141   5.166 2.40e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 82.1559  on 9  degrees of freedom
Residual deviance:  2.0048  on 8  degrees of freedom
AIC: 19.621

Number of Fisher Scoring iterations: 5
# Parameter estimates
Res.glm.logit$coef
(Intercept)         Age 
  -6.195516    1.106192 
beta0.est <- Res.glm.logit$coef[1]
beta1.est <- Res.glm.logit$coef[2]

# CI of the parameters
confint(Res.glm.logit)
                 2.5 %    97.5 %
(Intercept) -9.0616959 -4.098096
Age          0.7465759  1.601501
# Visual presentation of results
plot(Age, LogisticFn(Age,-5,1), type="l", ylab="Maturity")
points(Age, Maturity.obs)
points(Age, LogisticFn(Age,beta0.est,beta1.est), type="l", col="red")
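The fit above uses the logit link. As a sketch (not in the original code), the probit alternative (option 2 in the list of sigmoid functions earlier) only requires changing the link, and AIC can be used to compare the two:

Res.glm.probit <- glm(cbind(Yvec,Nvec-Yvec)~Age, family=binomial(link="probit"))
AIC(Res.glm.logit, Res.glm.probit)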

Confidence interval for the curve

# Estimates
pred.se <- predict(Res.glm.logit, se.fit=TRUE)  # Linear predictor

#Linear predictor's CI
Lower <- pred.se$fit - qnorm(0.975, 0, 1)* pred.se$se.fit
Upper <- pred.se$fit + qnorm(0.975, 0, 1)* pred.se$se.fit

#Logistic transformation of CI
logit.t<-function(x){ 1/(1+exp(-x)) }
pred.logit<- sapply(pred.se$fit, logit.t)
Lower.logit <- sapply(Lower, logit.t)
Upper.logit<- sapply(Upper, logit.t)
pcon.logit<-cbind(pred.logit, Lower.logit, Upper.logit)
plot(AgeRange, Maturity.obs, xlim=c(0,10), ylim=c(0,1))
matlines(AgeRange, pcon.logit, lty=c(1,2,2), col="blue", lwd=1.5)
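As a supplementary note (not in the original code), the age at 50% maturity follows directly from the logistic form: \(p(a)=0.5\) when \(\beta_0 + \beta_1 a = 0\), i.e. \(a_{50} = -\beta_0/\beta_1\).

# Point estimate of the age at 50% maturity
a50.est <- -beta0.est/beta1.est
a50.est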


Lecture 5. Hands-on exercises for estimation of growth formula (nonlinear regression)

As in the case of the allometric function, we will conduct a regression analysis. However, here we analyze the data without any transformation, using a “nonlinear regression model”.

A common and general formulation is as follows: \[ y_i = f(x_i, \theta) + \varepsilon_i, \quad \varepsilon_i \sim N(0,\sigma^2). \]

Here, we use a data set for alfonsino (in Japanese, “Kinmedai”). As in the case of the rainbow trout, we first read the data file.

Set up

Data <- read.csv("Data/Alfonsino.csv", header=T)
Age <- Data$Age
Length <- Data$Length
Sex <- as.numeric(Data$Sex) #"1" for female, "2" for male
mark <- c(19,2)
color <- c("red","blue")
plot(Length~Age,pch=mark[Sex],col=color[Sex],xlim=c(0,25),ylim=c(0,50))
legend(18,20,pch=mark,col=color,legend=c("Female","Male"),bty="n")

We will employ the famous von Bertalanffy growth function defined as \[ L_t = L_{\infty} (1-e^{-K (t-t_0)}). \]

# Definition of von Bertalanffy
growth.VB <- function(t,Linf,K,t0) Linf*(1.0-exp(-K*(t-t0)))

We then estimate the parameters \((L_{\infty}, K, t_0, \sigma)\) using the “nls” function in R.

Nonlinear regression (1) under an assumption of common parameters between female and male (including \(\sigma\))

We first assume the same growth function for female and male fish. We call this model “res.000” (“0” means common and “1” means different).

# Setting a set of initial values for parameters (except for sigma)
start <- list(Linf=max(Length),K=0.1,t0=0)

# Estimation
res.000 <- nls(Length~growth.VB(t=Age,Linf, K, t0), start=start)

# Showing the estimation summary
summary(res.000)

Formula: Length ~ growth.VB(t = Age, Linf, K, t0)

Parameters:
     Estimate Std. Error t value Pr(>|t|)    
Linf 47.95842    2.91479  16.453  < 2e-16 ***
K     0.08936    0.02413   3.704  0.00027 ***
t0   -6.38637    2.14489  -2.977  0.00324 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.564 on 215 degrees of freedom

Number of iterations to convergence: 4 
Achieved convergence tolerance: 5.63e-06
# Confidence interval for the growth parameters
confint(res.000)
             2.5%      97.5%
Linf  44.10249477 58.3766050
K      0.04465578  0.1360594
t0   -12.34296242 -3.3029532
# Visual presentation
taxis <- seq(0,25,0.5); tlen <- length(taxis);
pred.000 <- predict(res.000, list(Age=taxis))
plot(Length~Age,xlim=c(0,25),ylim=c(0,50), main="Results under res.000 with prediction CI")
points(taxis, pred.000, type="l")
points(taxis, pred.000-1.96*sigma(res.000), type="l", lty=2)
points(taxis, pred.000+1.96*sigma(res.000), type="l", lty=2)

# Diagnostics 
e <- residuals(res.000)/sigma(res.000)
plot(fitted(res.000), e, xlab="Fitted values", ylab="Standardized residuals")
abline(0,0); abline(-1.96,0,lty=2); abline(1.96,0,lty=2);

Nonlinear regression (2) under an assumption of different parameters between female and male (except for \(\sigma\))

# Setting a set of initial values for female and male simultaneously
start <- list(Linf=rep(max(Length),2),K=c(0.1,0.1),t0=c(0,0))

# Estimation
res.111 <- nls(Length~growth.VB(t=Age,Linf[Sex], K[Sex], t0[Sex]), start=start)

# Showing the estimation summary
summary(res.111)

Formula: Length ~ growth.VB(t = Age, Linf[Sex], K[Sex], t0[Sex])

Parameters:
      Estimate Std. Error t value Pr(>|t|)    
Linf1 48.15050    4.52391  10.644  < 2e-16 ***
Linf2 47.70053    3.75965  12.687  < 2e-16 ***
K1     0.09797    0.04315   2.270  0.02418 *  
K2     0.08646    0.02995   2.887  0.00429 ** 
t01   -5.12851    3.38334  -1.516  0.13106    
t02   -6.94091    2.80413  -2.475  0.01410 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.567 on 212 degrees of freedom

Number of iterations to convergence: 4 
Achieved convergence tolerance: 3.178e-06
# Confidence interval for the growth parameters
confint(res.111)
              2.5%      97.5%
Linf1  43.09967561 81.7105984
Linf2  43.18112305 66.9658334
K1      0.02164258  0.1833475
K2      0.03031102  0.1453427
t01   -18.02360053 -0.9526554
t02   -16.13978541 -3.1310768
# Visual presentation
taxis <- seq(0,25,0.5); tlen <- length(taxis);
pred.female.111 <- predict(res.111, list(Age=taxis, Sex=rep(1,tlen)))
pred.male.111 <- predict(res.111, list(Age=taxis, Sex=rep(2,tlen)) )

plot(Length~Age,pch=mark[Sex],col=color[Sex],xlim=c(0,25),ylim=c(0,50), 
     main="Results under res.111 with prediction CI")
points(taxis, pred.female.111, type="l", col="red", lty=1)
points(taxis, pred.female.111-1.96*sigma(res.111), type="l", col="red", lty=2)
points(taxis, pred.female.111+1.96*sigma(res.111), type="l", col="red", lty=2)
points(taxis, pred.male.111, type="l", col="blue", lty=1)
points(taxis, pred.male.111-1.96*sigma(res.111), type="l", col="blue", lty=2)
points(taxis, pred.male.111+1.96*sigma(res.111), type="l", col="blue", lty=2)
legend(18,20,pch=mark,col=color,lty=c(1,2),legend=c("Female","Male"),bty="n")

# Diagnostics for female
e <- residuals(res.111)/sigma(res.111)
plot(fitted(res.111)[Sex==1], e[Sex==1], col="red", 
     xlab="Fitted values", ylab="Standardized residuals")
abline(0,0); abline(-1.96,0,lty=2); abline(1.96,0,lty=2);

# Diagnostics for male
plot(fitted(res.111)[Sex==2], e[Sex==2], col="blue", 
     xlab="Fitted values", ylab="Standardized residuals")
abline(0,0); abline(-1.96,0,lty=2); abline(1.96,0,lty=2);

Comparison of models

We now compare which model is more appropriate in terms of model fit. Statistical hypothesis testing can be used for sequential comparison of nested models, but the easiest way is to use the AIC (Akaike Information Criterion), defined as \[ AIC = -2\max(\text{log-likelihood}) + 2\times(\text{number of parameters}). \] Smaller AIC values indicate better models. In R, the function “AIC” can be used to get the value.

# AIC values. 
AIC(res.000)
[1] 1034.122
AIC(res.111)
[1] 1037.531
# Plot (not very different)
plot(Length~Age,pch=mark[Sex],col=color[Sex],xlim=c(0,25),ylim=c(0,50))
points(taxis, pred.000, type="l", lwd=2)
points(taxis, pred.female.111, type="l", col="red", lty=1)
points(taxis, pred.male.111, type="l", col="blue", lty=2)

Note 1: you can analyze the data under other assumptions, such as a case in which only \(L_{\infty}\) differs between sexes (call it res.100), by modifying the code above; see the sketch below.
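A minimal sketch of such a model (untested here):

# Only Linf differs by sex; K and t0 are common
start <- list(Linf=rep(max(Length),2), K=0.1, t0=0)
res.100 <- nls(Length~growth.VB(t=Age, Linf[Sex], K, t0), start=start)
AIC(res.000, res.100, res.111)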

Note 2: the code above assumes a constant sigma across all ages. In our example, the age coverage is limited, and the residuals show a similar level of variance over the ages. However, when the data include a broad range of ages, a better approach is to assume heteroscedastic variance or to take the logarithm of the length data.


Lecture 6. Stock assessment by depletion method

Aims: this includes some extensions which I was not able to explain in the September seminar.

In Excel

See “DeLury_Excel_rev.xlsx” if you wish to use MS Excel.

In R

Here is the very simple underlying formula (the Leslie model), where \(K_t\) denotes the cumulative catch up to period \(t\):

\[ CPUE_t = q N_1 - q K_t \]

# Reading data
Data <- read.csv("Data/DeLury_R.csv")
Data
   Period Catch Effort  CPUE CumCatch CumEffort
1       1    60  170.6 0.352        0       0.0
2       2   274  453.8 0.604       60     170.6
3       3   240  513.4 0.467      334     624.4
4       4   244  714.4 0.342      574    1137.8
5       5   301  679.1 0.443      818    1852.2
6       6   151  419.9 0.360     1119    2531.3
7       7   127  470.3 0.270     1270    2951.2
8       8    90  318.4 0.283     1397    3421.5
9       9    31  136.8 0.227     1487    3739.9
10     10     7  177.6 0.039     1518    3876.7
CPUE <- Data$CPUE
CumCatch <- Data$CumCatch
CumEffort <- Data$CumEffort
plot(CumCatch, CPUE)

# De Lury method 1
Res.DeLury1 <- lm(CPUE~CumCatch)
summary(Res.DeLury1)

Call:
lm(formula = CPUE ~ CumCatch)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.16798 -0.03658  0.01883  0.06804  0.10617 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  5.098e-01  6.013e-02   8.478 2.87e-05 ***
CumCatch    -1.995e-04  5.883e-05  -3.391  0.00949 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1034 on 8 degrees of freedom
Multiple R-squared:  0.5897,    Adjusted R-squared:  0.5384 
F-statistic:  11.5 on 1 and 8 DF,  p-value: 0.00949
abline(Res.DeLury1, col="blue")

# Estimates by De Lury method 1
Para1 <- coef(Res.DeLury1); Para1
 (Intercept)     CumCatch 
 0.509796545 -0.000199483 
q.est1 <- -Para1[2]
N.est1 <- Para1[1]/q.est1
data.frame(q.est1, N.est1)
              q.est1   N.est1
CumCatch 0.000199483 2555.589

The second variant uses the cumulative effort \(L_t\) and the natural mortality rate \(M\):

\[ \log CPUE_t = \log(q N_1) - q L_t - M(t-1) \]

By assuming \(M=0\), then

# De Lury method 2
Res.DeLury2 <- lm(log(CPUE)~CumEffort)
summary(Res.DeLury2)

Call:
lm(formula = log(CPUE) ~ CumEffort)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.36018 -0.09308  0.18410  0.37092  0.46580 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.5562541  0.3303927  -1.684   0.1308  
CumEffort   -0.0003425  0.0001338  -2.560   0.0336 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5946 on 8 degrees of freedom
Multiple R-squared:  0.4503,    Adjusted R-squared:  0.3816 
F-statistic: 6.553 on 1 and 8 DF,  p-value: 0.03365
plot(CumEffort, log(CPUE))
abline(Res.DeLury2, col="blue")

# Estimates by De Lury method 2
Para2 <- coef(Res.DeLury2); Para2
  (Intercept)     CumEffort 
-0.5562540837 -0.0003424973 
q.est2 <- -Para2[2]
N.est2 <- exp(Para2[1])/q.est2
data.frame(q.est2, N.est2)
                q.est2   N.est2
CumEffort 0.0003424973 1674.036

Lecture 7. Hands-on exercises for CPUE standardization

Set up

Assume that there are observations of CPUE for a total of five fishing areas over 20 years (a very ideal situation!). In addition, two fishing gears have been used over that period.

Narea <- 5
Nyear <- 20
Ngear <- 2

Noperation <- 50 #number of fishing operations per year 

Ref_mu <- 2
AreaEffect <- log(c(1,2,3,4,5))
YearEffect <- log(seq(1,2,length.out=Nyear))
GearEffect <- log(c(1,1.2))

AreaSelection <- rep(1,Nyear)
GearSelection <- seq(0.3, 0.7, length.out=Nyear)

Data generation

Data <- as.matrix(array(NA, c(Nyear*Noperation, 4)))
colnames(Data) <- c("Catch", "Year", "Area", "Gear")

for(y in 1:Nyear){
  for(i in 1:Noperation){
    Tmp_Area <- sample(1:Narea, 1)
    Tmp_Gear <- sample(1:Ngear, 1, prob=c(GearSelection[y], 1-GearSelection[y]))
    Tmp_mu <- Ref_mu*exp(YearEffect[y] + AreaEffect[Tmp_Area] + GearEffect[Tmp_Gear])
    Tmp_Catch <- rpois(1,Tmp_mu)
    Data[(y-1)*Noperation + i, ] <- c(Tmp_Catch, y, Tmp_Area, Tmp_Gear)
  }
}
Data <- as.data.frame(Data)

Analysis by GLM (for simplicity, with a Poisson assumption)

# GLM analysis
Res1 <- glm(Catch~factor(Year), family="poisson", data=Data)
Res1 

Call:  glm(formula = Catch ~ factor(Year), family = "poisson", data = Data)

Coefficients:
   (Intercept)   factor(Year)2   factor(Year)3   factor(Year)4  
        1.7884          0.1689          0.2810          0.3540  
 factor(Year)5   factor(Year)6   factor(Year)7   factor(Year)8  
        0.3278          0.5122          0.4919          0.3726  
 factor(Year)9  factor(Year)10  factor(Year)11  factor(Year)12  
        0.4459          0.4858          0.4373          0.4960  
factor(Year)13  factor(Year)14  factor(Year)15  factor(Year)16  
        0.3931          0.4816          0.7097          0.8290  
factor(Year)17  factor(Year)18  factor(Year)19  factor(Year)20  
        0.6399          0.5930          0.8521          0.6780  

Degrees of Freedom: 999 Total (i.e. Null);  980 Residual
Null Deviance:      3915 
Residual Deviance: 3521     AIC: 7446
# Extraction of CPUE trend, which is the trend estimate of population abundance
plot(Ref_mu*exp(YearEffect), type="l")
points(exp(Res1$coefficients[1:20]))
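The year-only model above ignores the gradual shift in gear usage built into the data generation. As a sketch of a fuller standardization model (not in the original code), we can simply add the area and gear factors:

Res2 <- glm(Catch~factor(Year)+factor(Area)+factor(Gear), family="poisson", data=Data)

# Standardized year trend on a relative scale (first year = 1)
plot(exp(YearEffect), type="l", ylab="Relative trend")
points(exp(c(0, Res2$coefficients[2:20])))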


Lecture 8. Simple spatial mapping with GAM

Set up

The data we are using as an example is loaded as follows:

data(mack, package="gamair") # the "gamair" package must be installed beforehand
str(mack)
'data.frame':   634 obs. of  16 variables:
 $ egg.count  : num  0 0 0 1 4 3 0 1 0 0 ...
 $ egg.dens   : num  0 0 0 18.3 90.2 ...
 $ b.depth    : num  4342 4334 4286 1438 166 ...
 $ lat        : num  44.6 44.6 44.6 44 44 ...
 $ lon        : num  -4.65 -4.48 -4.3 -2.87 -2.07 -2.13 -2.27 -2.35 -2.42 -2.48 ...
 $ time       : num  8.23 9.68 10.9 19.33 8.78 ...
 $ salinity   : num  35.7 35.7 35.7 35.7 NA ...
 $ flow       : num  417 405 377 420 354 373 375 364 375 362 ...
 $ s.depth    : num  104 98 101 98 101 100 100 102 100 100 ...
 $ temp.surf  : num  15 15.4 15.9 16.6 16.7 16.6 17.1 17.1 17.3 16.9 ...
 $ temp.20m   : num  15 15.4 15.9 16.6 16.7 16.6 17.1 17.1 17.3 16.9 ...
 $ net.area   : num  0.242 0.242 0.242 0.242 0.242 0.242 0.242 0.242 0.242 0.242 ...
 $ country    : Factor w/ 4 levels "EN","fr","IR",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ vessel     : Factor w/ 4 levels "CIRO","COSA",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ vessel.haul: num  22 23 24 93 178 179 181 182 183 184 ...
 $ c.dist     : num  0.84 0.859 0.893 0.396 0.04 ...
loc <- cbind(mack$lon, mack$lat)
plot(loc, cex=2*mack$egg.count/max(mack$egg.count), xlab="Lon", ylab="Lat")
points(loc, pch=19, cex=0.3, col="red")

The model assumption is as below (sorry for the simple text…); this is a conventional GAM with spatial correlation and an environmental covariate.

\[ Y_i \sim Pois(\lambda_i), \qquad \log \lambda_i = \log(\text{net.area}) + s(\text{temp.20m}) + s(\text{lon}, \text{lat}) \]

GAM analysis

Res.gam <- gam(egg.count ~ offset(net.area) -1 + s(temp.20m) + s(lon, lat),
               family="poisson", data=mack)

summary(Res.gam)

Family: poisson 
Link function: log 

Formula:
egg.count ~ offset(net.area) - 1 + s(temp.20m) + s(lon, lat)

Approximate significance of smooth terms:
               edf Ref.df Chi.sq p-value    
s(temp.20m)  8.871  8.989    276  <2e-16 ***
s(lon,lat)  28.947 28.999   3154  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.644   Deviance explained = 76.9%
UBRE =  5.833  Scale est. = 1         n = 634
zz <- Res.gam$fitted.values

plot(Res.gam)

quilt.plot(mack$lon, mack$lat, zz, xlab="Lon", ylab="Lat")

ggmap

qmplot(lon, lat, zz, data=mack, zoom = 5) 

ggplot

ggplot(mack) + 
  geom_point(aes(x=lon,y=lat,alpha=0.2,size=zz,col=zz)) + 
  coord_quickmap() 

Lon.range = c(-15, 0)
Lat.range = c(44, 60)
dat <- getNOAA.bathy(Lon.range[1],Lon.range[2],Lat.range[1],Lat.range[2],res=5)

autoplot(dat, geom=c("r", "c"), colour="white", size=0.1)+
 scale_fill_etopo()+ 
 labs(x="Longitude",y="Latitude",fill="Depth") + ggtitle("Map") +
  geom_point(data=mack,aes(x=lon,y=lat,alpha=0.2,size=zz,col=zz))+
 coord_quickmap() 

# Create nice color palettes
blues <- c("lightsteelblue4", "lightsteelblue3",
           "lightsteelblue2", "lightsteelblue1")
greys <- c(grey(0.6), grey(0.93), grey(0.99))

# Mapping
plot(dat, im=TRUE, land=TRUE, bpal=list(c(min(dat),0,blues),c(0,max(dat),greys)),
     lwd=.05, las=1, xlim=c(-15,0))
map("worldHires", res=0, lwd=0.7, add=TRUE)
points(mack$lon, mack$lat, cex=zz/50, pch=19, col=rgb(1,0,0,0.2))


Lecture 9. Production model

Theoretical background in brief


\[ B_{t+1} = B_t + rB_t \left\{1-\left(\frac{B_t}{K}\right)^z \right\} - C_t \]

Assume that the CPUE is proportional to the population biomass as follows: \[ CPUE = q B. \]

Here \(q\) is a proportional coefficient called “catchability coefficient”.

Let us assume the following time series model and try to estimate its parameters. \[ \log CPUE_t = \log q + \log B_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2) \]
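Before fitting real data, a minimal numerical sketch of the dynamics above (Schaefer case, \(z=1\), with purely illustrative values \(r=0.4\), \(K=1000\) and a constant catch of 80) may help:

# Project biomass forward under the Schaefer model
r <- 0.4; K <- 1000; C <- 80
B <- numeric(30); B[1] <- K
for(t in 1:29) B[t+1] <- max(B[t] + r*B[t]*(1 - B[t]/K) - C, 0.001)
plot(B, type="l", xlab="Year", ylab="Biomass")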


Lecture 10. Hands-on exercises for production model using ML estimation

Reading the data

# Data for Indian Ocean albacore
Data <- read.csv("Data/Albacore.csv", header=T)
Year <- Data$Year
Catch <- Data$Catch
CPUE <- Data$CPUE
CV <- Data$CV
TT <- length(Year)

start.year <- min(Year)
final.year <- max(Year)

Visual presentation of data

par(mai = c(1, 1.1, 1, 1.1))
plot(Year, Catch, type = "h", lwd=5, col="gray", ylab="Catch (tons)")
par(new =T)
plot(Year, CPUE, type="l", lwd=2, col="red", axes=FALSE,  ylab="", ylim=c(0,2))
axis(4); mtext("CPUE", side=4, line=3)

Definition of population dynamics

PDM.PT<-function(r, K, z, B1, TT, Catch)
{
 B<-rep(NA,TT)
 B[1]<-B1
 for(t in 1:(TT-1))
 {
  tmp<-B[t]+r*B[t]*(1-(B[t]/K)^z)-Catch[t] 
  B[t+1]<-max(tmp,0.001)
 }
 B
}

Definition of likelihood function (z=1; Schaefer model)

NLL.PT <- function(par,z)
{
 r<-exp(par[1])
 K<-exp(par[2])
 q<-exp(par[3])
 sigma <-exp(par[4])
 B<-PDM.PT(r, K, z, K, TT, Catch)
 
 yobs <- log(CPUE)
 mu <- log(q) + log(B)
 
 pdf <- dnorm(yobs, mu, sigma, log=T)
 loglike <- sum(pdf, na.rm=T)
 negloglike <- -loglike
 return(negloglike)
}

Estimation of parameters by the maximum likelihood (ML) method

init.MSY <- 2*mean(Catch)
init.r <- 0.4
init.K <- 4*init.MSY/init.r

init<-log(c(init.r, init.K, 10^(-6), 1)) 
NLL.PT(init, z=1)
[1] 46.552
NLL.PT(2*init, z=1)
[1] 74.13025
res.SC<-optim(init, NLL.PT, z=1, method="BFGS", control=list(reltol=10^(-8))) 
res <- res.SC; res
$par
[1]  -0.9788076  12.7297969 -12.4127161  -1.4274852

$value
[1] -0.2648306

$counts
function gradient 
     172       51 

$convergence
[1] 0

$message
NULL
r.est<-exp(res$par[1])
K.est<-exp(res$par[2])
q.est<-exp(res$par[3])
sigma.est<-exp(res$par[4])

MSY.est <- r.est*K.est/4
MSYL.est <- K.est/2

data.frame(r.est, K.est, q.est, sigma.est)
      r.est    K.est        q.est sigma.est
1 0.3757589 337660.7 4.066548e-06 0.2399115
Predicted <- PDM.PT(r.est, K.est, z=1, K.est, TT, Catch)
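As a supplementary sketch (not in the original handout), approximate standard errors on the log scale can be obtained by re-running “optim” from the solution with hessian=TRUE and inverting the Hessian of the negative log-likelihood:

res.H <- optim(res$par, NLL.PT, z=1, method="BFGS", hessian=TRUE)
SE.log <- sqrt(diag(solve(res.H$hessian)))
data.frame(parameter=c("log r","log K","log q","log sigma"), estimate=res.H$par, SE=SE.log)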

Visual presentation of outcomes (1)

# Goodness of fit (observed vs. predicted log CPUE)
par(mfrow=c(2,2))
plot(log(CPUE), log(q.est*Predicted), xlim=c(-1,1), ylim=c(-1,1)); abline(0,1)

# Time series plots for CPUE observation and its prediction by the model
plot(Year, CPUE, ylim=c(0, max(CPUE, na.rm=T)) )
points(Year, q.est*Predicted, type="l")

# Times series of estimated biomass
plot(Year, Predicted, type="l", ylim=c(0, 1.1*K.est), xaxs="i",yaxs="i")

# Times series of depletion level against the carrying capacity (B/K)
plot(Year, Predicted/K.est, type="l", ylim=c(0,1), xaxs="i",yaxs="i", ylab="Depletion")
rect(xleft=-1, ybottom=0, xright=2010, ytop=0.4, lwd=0, col=rgb(1,0,0,0.3))
rect(xleft=-1, ybottom=0.4, xright=2010, ytop=0.6, lwd=0, col=rgb(1,1,0,0.3))
rect(xleft=-1, ybottom=0.6, xright=2010, ytop=1, lwd=0, col=rgb(0,1,0, 0.3))

Visual presentation of outcomes (2)

# Definition of B-ratio and F-ratio
Bratio <- Predicted/MSYL.est
Fratio <- Catch/(MSY.est) #actually Hratio

plot(Bratio,Fratio,type="n",xlim=c(0,2),ylim=c(0,2),
     xlab="B/Bmsy",ylab="F/Fmsy", main="Kobe plot")
rect(xleft=0,ybottom=0,xright=1,ytop=1,lwd=0,col="#FEFF63FF")
rect(xleft=0,ybottom=1,xright=1,ytop=2,lwd=0,col="#FF4949FF")
rect(xleft=1,ybottom=1,xright=2,ytop=2,lwd=0,col="#FFBE7DFF")
rect(xleft=1,ybottom=0,xright=2,ytop=1,lwd=0,col="#85FF68FF")

points(Bratio,Fratio,type="l",lwd=1.8)
points(Bratio[1:(TT-1)],Fratio[1:(TT-1)],lwd=1,cex=3,pch=21,bg="skyblue")
points(Bratio[TT],Fratio[TT],lwd=1,cex=3,pch=21,bg="blue")
num <- c(seq((start.year-1900),99,1),seq(0,(final.year-2000),1))
text(Bratio,Fratio,labels=num,adj=0.5,cex=1, col="white")


Lecture 11. Hands-on exercises for state-space production models

Installation of JAGS

Please download rjags_4-3.zip from the site below (note that the JAGS program itself must also be installed; it is available from the same SourceForge project):

http://sourceforge.net/projects/mcmc-jags/files/rjags/4/

Then, in an R session:

#install.packages("rjags")
#library(rjags) 
#
#install.packages("runjags")
#library(runjags) 

#install.packages("jagsUI")
library(jagsUI)

Data for Indian Ocean albacore

Data <- read.csv("Example_BSPM/ALB_DB_5016.csv", header=T)
Year <- Data$Year
Catch <- Data$Catch
CPUE.R3 <- Data$CPUE_R3
CPUE.R4 <- Data$CPUE_R4
TT <- length(Year)

par(mai = c(1, 1.1, 1, 1.1))
plot(Year, Catch, type = "h", lwd=5, col="gray", ylab="Catch (tons)")
par(new =T)
plot(Year, CPUE.R3, type="l", lwd=2, col="red", axes=FALSE,  ylab="", ylim=c(0,2))
points(Year, CPUE.R4, type="l", lwd=2, col="blue")
axis(4); mtext("CPUE", side=4, line=3)

start.year <- Year[1]
final.year <- Year[TT]
Year.est <- start.year:(final.year+1)

CPUE.R3.ind<-which(!is.na(CPUE.R3))
CPUE.R3_rem <- CPUE.R3[CPUE.R3.ind]

CPUE.R4.ind<-which(!is.na(CPUE.R4))
CPUE.R4_rem <- CPUE.R4[CPUE.R4.ind]

N <- length(Year.est)
N.R3 <- length(CPUE.R3_rem)
N.R4 <- length(CPUE.R4_rem)

JAGS run (using the package “jagsUI”)

Set-up

#Data
JAGS.data <- list(N1=N.R3,N2=N.R4,N=N,Catch=Catch,CPUE.1=CPUE.R3_rem,CPUE.2=CPUE.R4_rem,Period_CPUE.1=CPUE.R3.ind,Period_CPUE.2=CPUE.R4.ind,Dep=1)

#Parameters
Params<-c("B","D","F","Fratio","Bratio","CPUE.1.est","CPUE.2.est","r","K","MSY","BMSY","FMSY","inv.sigma2_CPUE.1","inv.sigma2_CPUE.2","inv.tau2","q_CPUE.1","q_CPUE.2","sigma_CPUE.1","sigma_CPUE.2","tau")

Preparation of a file for “model and estimation”

sink("Example_BSPM/BSPM.txt")
cat("
model{

##observation-process

##CPUE.1
for(i in 1:N1){
 CPUE.1[i] ~ dlnorm(L_CPUE.1.est[i],inv.sigma2_CPUE.1)
 L_CPUE.1.est[i] <- log(CPUE.1.est[i])
 CPUE.1.est[i] <- q_CPUE.1*D[Period_CPUE.1[i]]*K
}

##CPUE.2
for(i in 1:N2){
 CPUE.2[i] ~ dlnorm(L_CPUE.2.est[i],inv.sigma2_CPUE.2) 
 L_CPUE.2.est[i] <- log(CPUE.2.est[i])
 CPUE.2.est[i] <- q_CPUE.2*D[Period_CPUE.2[i]]*K
}

##state-process(1950-2016)
for(i in 2:N){
 D[i] ~ dlnorm(Dtmp[i],inv.tau2)
 tmp[i] <- D[i-1]+r*D[i-1]*beta[i-1]-Catch[i-1]/K
 beta[i-1] <- 1-D[i-1]
 Dtmp[i] <- log(max(tmp[i],0.001))
}

##B(1950-2016)
for(i in 1:N){
 B[i] <- K*D[i]
 Bratio[i] <- B[i]/BMSY 
}

##F(1950-2015)
for(i in 1:(N-1)){
 F[i] <- Catch[i]/B[i]
 Fratio[i] <- F[i]/FMSY
}

##projection of CPUE.1
for(i in 1:N1){
 CPUE.1.pred[i] ~ dlnorm(L_CPUE.1.est[i],inv.sigma2_CPUE.1)
}

##projection of CPUE.2
for(i in 1:N2){
 CPUE.2.pred[i] ~ dlnorm(L_CPUE.2.est[i],inv.sigma2_CPUE.2)
}

##prior
r~dunif(0,3)
K~dunif(0,10000000)

D[1] ~ dlnorm(L_Dep,inv.tau2)
L_Dep <- 0

q_CPUE.1~dunif(0,0.01)
q_CPUE.2~dunif(0,0.01)

inv.sigma2_CPUE.1 <- 1/(sigma_CPUE.1*sigma_CPUE.1)
sigma_CPUE.1 ~dunif(0,2)

inv.sigma2_CPUE.2 <- 1/(sigma_CPUE.2*sigma_CPUE.2)
sigma_CPUE.2 ~dunif(0,2)

inv.tau2 <- 1/(tau*tau)
tau ~dunif(0,2)

MSY <- (r*K)/4
BMSY <- K/2
FMSY <- MSY/BMSY

}
", fill=TRUE)
sink()

RUN

# Set WD
setwd(NewDir) # "NewDir" should point to the "Example_BSPM" folder containing "BSPM.txt"

# MCMC option
n.chains <- 3
n.burnin <- 50000
n.thin <- 100
n.iter <- 350000

# Definition of model run
res <- jags(data=JAGS.data, model.file="BSPM.txt",
              parameters.to.save=Params, n.chains=n.chains,
              n.burnin=n.burnin, n.thin=n.thin, n.iter=n.iter)
saveRDS(res, file = "res.obj")

Results

#res <- readRDS("Example_BSPM/res.obj")
B.med <- res$q50$B
B.LB <- res$q2.5$B
B.UB <- res$q97.5$B

K.med <- res$q50$K

BMSY.med <- res$q50$BMSY

D.med <- res$q50$D
D.LB <- res$q2.5$D
D.UB <- res$q97.5$D

plot(Year.est, B.med, type="l", ylim=c(0, 1.2*K.med),
     xaxs="i",yaxs="i", ylab="Biomass (tons)")
points(Year.est, B.LB, type="l", lty=2)
points(Year.est, B.UB, type="l", lty=2)
abline(K.med,0)
abline(BMSY.med,0)

RED<-rgb(1, 0, 0, alpha=0.2)
YEL<-rgb(1, 1, 0, alpha=0.2)
GRE<-rgb(0, 1, 0, alpha=0.2)
plot(Year.est, D.med, type="l", ylim=c(0,1.2),
     xaxs="i",yaxs="i", ylab="Depletion")
rect(xleft=-1, ybottom=0, xright=2020, ytop=0.4, lwd=0, col=RED)
rect(xleft=-1, ybottom=0.4, xright=2020, ytop=0.6, lwd=0, col=YEL)
rect(xleft=-1, ybottom=0.6, xright=2020, ytop=2, lwd=0, col=GRE)
points(Year.est, D.LB, type="l", lty=2)
points(Year.est, D.UB, type="l", lty=2)
abline(1,0)

Kobe plot

#we need these packages
library(reshape)
library(MASS)

Kobe_CI <- function(start.year,final.year,Bratio,Fratio,rec.Bratio,rec.Fratio,plot_ylim){
  
  T <- (final.year-start.year)+1
  Year <- start.year:final.year
  xy <- data.frame(x=rec.Bratio,y=rec.Fratio)
  xy <- subset(xy,xy$y<=plot_ylim)
  xx <- xy$x
  yy <- xy$y
  d <- kde2d(xx,yy,c(bandwidth.nrd(xx),bandwidth.nrd(yy)),n=100)
  tes<-function(x,y)(sum(d$z[d$z>x])*(max(d$x)-min(d$x))/100*(max(d$y)-min(d$y))/100)/(sum(d$z)*(max(d$x)-min(d$x))/100*(max(d$y)-min(d$y))/100)-y
  d2<-d
  H<-seq(0.01,1,0.01)
  H<-rev(H)
  for(i in 1:99){
    h1<-uniroot(tes,range(d),H[i])$root
    d2$z[h1<d$z]<-H[i]
  }
  h1<-uniroot(tes,range(d),H[100])$root
  d2$z[h1<d$z]<-H[100]
  
  par(cex.axis=1,cex.lab=1)
  par(mar=c(4,4,1,1))
  
  plot(Bratio,Fratio,type="n",xlim=c(0,plot_ylim),ylim=c(0,plot_ylim),xlab="B/Bmsy",ylab="F/Fmsy",bty="n")
  rect(xleft=0,ybottom=0,xright=1,ytop=1,lwd=0,col="#FEFF63FF")
  rect(xleft=0,ybottom=1,xright=1,ytop=plot_ylim,lwd=0,col="#FF4949FF")
  rect(xleft=1,ybottom=1,xright=plot_ylim,ytop=plot_ylim,lwd=0,col="#FFBE7DFF")
  rect(xleft=1,ybottom=0,xright=plot_ylim,ytop=1,lwd=0,col="#85FF68FF")
  .filled.contour(d2$x,d2$y,d2$z,col=rgb(0.9, 0.9, 0.9, alpha=0.8),levels=c(0,0.8))
  contour(d2,levels=0.8,add=T,lty=0,drawlabels=F)
  points(Bratio,Fratio,type="l",lwd=1.8)
  points(Bratio[1:10],Fratio[1:10],lwd=1,cex=1.5,col="black",pch=21,bg="slateblue")
  points(Bratio[11:20],Fratio[11:20],lwd=1,cex=1.5,col="black",pch=21,bg="cadetblue")
  points(Bratio[21:30],Fratio[21:30],lwd=1,cex=1.5,col="black",pch=21,bg="deepskyblue")
  points(Bratio[31:(T-1)],Fratio[31:(T-1)],lwd=1,cex=1.5,col="black",pch=21,bg="lightblue1")
  points(Bratio[T],Fratio[T],lwd=1,cex=1.5,col="black",pch=21,bg="white")
  num <- c(seq((start.year-1900),99,1),seq(0,(final.year-2000),1))
  text(Bratio,Fratio,labels=num,adj=0.5,cex=0.5)
  legend(plot_ylim,plot_ylim-1.2,paste(c("80%CI")),pt.bg=rgb(0.9, 0.9, 0.9, alpha=0.8),bty="n",cex=0.8,pch=22,pt.cex=2,xjust=1,yjust=1)
  #par(mar=c(6,6,0,0))
  par(new=T,fig=c(0.3,0.9,0.45,0.95))
  ratio.d <- data.frame(x=rec.Bratio,y=rec.Fratio)
  ratio.red <- subset(ratio.d, 0<=x & x<=1 & 1<=y)
  ratio.ora <- subset(ratio.d,1<=x & 1<=y)
  ratio.yel <- subset(ratio.d,0<=x & x<=1 & 0<=y & y<=1)
  ratio.gre <- subset(ratio.d,1<=x & 0<=y & y<=1)
  pie(c(length(ratio.gre$x)/length(ratio.d$x),length(ratio.ora$x)/length(ratio.d$x),length(ratio.red$x)/length(ratio.d$x),length(ratio.yel$x)/length(ratio.d$x)),labels=round(c(length(ratio.gre$x)/length(ratio.d$x),length(ratio.ora$x)/length(ratio.d$x),length(ratio.red$x)/length(ratio.d$x),length(ratio.yel$x)/length(ratio.d$x)),3),radius=0.9,cex=0.7,col=c("#85FF68FF","#FFBE7DFF","#FF4949FF","#FEFF63FF"),cex.lab=0.8)
}
Kobe_CI(1950,2016,res$q50$Bratio[-N],res$q50$Fratio,res$sims.list$Bratio[,(N-1)],res$sims.list$Fratio[,(N-1)],3)

(to come later)

setwd(CurrentDir) # return to the original working directory (saved in "CurrentDir" beforehand)

Lecture 12. Age-structured population dynamics

Lecture 13. Hands-on exercises for age-structured models

To work with an “age-structured model”, you mostly have to combine multiple sources of information, such as CPUE series and the age- or length-composition of the catch, as well as the total catch. There are several candidate methods with well-developed software, and Stock Synthesis 3 (SS3) is one of them. I have been involved in some tuna RFMOs, and SS3 is one of the most common platforms, especially in IOTC.

For formal use of SS3, you need to register at NOAA’s site (USA) to download the necessary “exe” file, etc. The set-up and the preparation of your own data are not straightforward, and I do not think all of the participants will use such an advanced method. Therefore, I will only briefly demonstrate how to run it. That being said, I leave a memo for installation and test runs below. Please feel free to ask me for more.

Installation of SS3

For downloading SS3, please visit the following website and register for “Stock Synthesis version 3.0”.

https://vlab.ncep.noaa.gov/web/stock-synthesis

Installation of “r4ss” for graphical outputs of SS3 outcomes

Previously, “r4ss” was available on CRAN, but the CRAN version has not been updated recently. So we shall install it from GitHub using the “devtools” package. After installation, you can load the library.

install.packages("devtools")
devtools::install_github("r4ss/r4ss", force = TRUE, build_vignettes = TRUE)
library(r4ss)
#browseVignettes("r4ss")

Unless you see “error” messages, there is no problem (you do not need to worry about “warning” messages). See https://github.com/r4ss/r4ss for more details.
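To confirm that the installation succeeded, you can also check the installed version (any version number printed without an error is fine):

packageVersion("r4ss")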

First, we will run a demonstration to see what results are available from SS3, using a set of example files stored in the r4ss package. Two functions are central:

  • “SS_output”: A function to create a list object for the output from Stock Synthesis
  • “SS_plots”: A function to plot many quantities related to output from Stock Synthesis

To see the syntax, you can call the help files as follows:

?SS_output
?SS_plots

Running “SS3” and then using “r4ss” (with the example “Simple”)

The files necessary to run SS3 are as follows:

  • Starter file (starter.ss)
  • Data file (Filename.dat or any name)
  • Control file (Filename.ctl or any name)
  • Forecast file (forecast.ss)

While you can give any names to the “data” and “control” files, please use the “Starter” and “Forecast” files as they are. When you open the “starter.ss” file,

Filename.dat
Filename.ctl

are given in the first two effective (non-comment) lines; edit them to match your own file names.

Now, follow these steps:

  1. Create a new folder (e.g. on the desktop) named “Example_SS3”
  2. Create another folder “Simple” under “Example_SS3” and copy the four files into it
  3. Copy and paste the SS3.30 executable “ss_opt.exe” (downloaded from the NOAA Virtual Lab) into “Simple”
  4. Open the Command Prompt, move to “Simple”, and run “ss_opt” (a sketch for doing this from R follows the list)
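Alternatively, you can launch the executable from within R. This is only a sketch, and the path below is hypothetical; adjust it to wherever you created the “Simple” folder:

setwd("C:/Users/YourName/Desktop/Example_SS3/Simple")   # adjust this hypothetical path first!
system("ss_opt")   # on Windows; use system("./ss_opt") on Mac/Linux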

As you can see in the first line of the output, there is a warning message “Some stats skipped because the .par file not found”. SS3 is written in ADMB, and usually, once an ADMB program is compiled and executed, several important files are created (a small sketch for inspecting them from R follows the list):

  • “.par” to save the parameter estimates
  • “.std” to save the estimated quantities (including the parameters themselves) and their standard errors (SEs), assessed by the inverse of the Fisher information matrix and the delta method
  • “.cor” to save the correlation matrix of the estimated quantities
  • “.rep” to save the report file, whose format is defined in the ADMB code
I hope you succeed in the computation. Then use “r4ss” to draw the outputs. Sorry for making you do the same thing twice, but this is the usual SS3 procedure.

#Run "ss_opt.exe"
path <- getwd()
mywd <- file.path(path, "Example_SS3", "Simple")   ## for you! (avoid a leading "/" with file.path)
replist <- SS_output(dir=mywd)
SS_plots(replist, png=T, pdf=F) #default plot

Just in case, if you prefer “pdf” to “html” as an output format, run as follows:

SS_plots(replist, png=F, pdf=T)

(Figures “SS3_res1” and “SS3_res2”: example graphical outputs from SS_plots.)


Lecture 14. Overview of management strategy evaluation (see ppt)

Lecture 15. Exercises of simple MSE using R

I will distribute a set of files for this shiny demonstration on site.
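The app assumes that the “shiny” package is available. If you run the code below as a stand-alone script, install and load it first:

#install.packages("shiny")   # only once, if not yet installed
library(shiny)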

# Define the UI of the MSE application
ui <- fluidPage(
  
  ## Title line
  titlePanel("Example of Management Strategy Evaluation"),
  
  ## First line
  fluidRow(
    column(6, wellPanel(
      textInput("Objectives", h4("What is your management objective?"),
                value="Enter your objectives"),
      numericInput(inputId = "Objective1", label = "Prob(B > Bmsy) = ", 
                   value = 0.5, step=0.1, min=0, max=1)
    )), 
    column(6, wellPanel(
      numericInput(inputId = "Nyear", label = "Management Period:", 
                   value = 10, step=1, width = "50%"),
      numericInput(inputId = "Nsim", label = "Number of simulation replicas:",
                   value = 500, max=1000, step=100, width = "50%")
    )) 
  ),
  
  ## Second line
  fluidRow(
    
    column(3, wellPanel(
      h4("Configulation of Operating Models"), 
      numericInput(inputId = "Dinitial", label = "Initial Depletion (D0):", 
                   value = 0.4, step=0.05, min=0.1, max=0.9),
      numericInput(inputId = "r", label = "Intrinsic Rate of Increase (r):", 
                   value = 0.1, step=0.01, min=0.01, max=0.5),
      numericInput(inputId = "K", label = "Carrying Capacity (K):", 
                   value = 10000, step=1000),
      numericInput(inputId = "z", label = "Shape Parameter (z):", 
                   value = 1, step=0.1, min=0.1, max=3),
      numericInput(inputId = "EstCV", label = "Precision of Abun Estimate (CV):", 
                   value = 0.1, step=0.05, min=0, max=0.5), 
      numericInput(inputId = "ProCV", label = "Extent of Process Error (tau):",
                   value = 0.05, step=0.01, min=0, max=0.2)
    )),
    
    column(3, wellPanel(
      h4("Setting your Management Procedure"), 
      numericInput(inputId = "LRP", label = "Limit Reference Point:", 
                   value = 0.1, step=0.05, min=0, max=0.5),
      numericInput(inputId = "TRP", label = "Target Reference Point:", 
                   value = 0.4, step=0.05),
      numericInput(inputId = "MaxChange", label="Max Allowable Change in catch (%):", 
                   value = 30, step=5),
      selectInput(inputId = "Assessment", 
                  label = "Do assessment? (this does not work yet):", 
                  choices = list("Yes, parameters are updated"=1, 
                                 "No, assessment is not conducted"=2), selected=1),
      submitButton("Run"),
      downloadButton("report", "Download PDF")
    )),
    
    column(6, wellPanel(
      h4("Simulation results"), 
      tabsetPanel(
        tabPanel("Performance Mesures",tableOutput("Performance")),
        tabPanel("Your Harvest Control Rule", plotOutput("HCR")),
        tabPanel("Kobe Matrix", tableOutput("Kobe2"))
      )
    ))
  ), 
  
  ## Third line 
  fluidRow(
    column(4,wellPanel(
      plotOutput("Trajectories_B")
    )),
    
    column(4,wellPanel(
      plotOutput("Trajectories_F")
    )),
    
    column(4,wellPanel(
      plotOutput("Kobe1")
    ))
  )
  
)

# Define the server logic of the MSE simulation
server <- function(input, output){
  
  Simulation <- reactive({
    
    # Model-based harvest control rule: F = 0 below LRP, increases linearly to
    # Fmsy between LRP and TRP, and stays at Fmsy above TRP (hockey-stick shape)
    HCR <- function(Bcurrent, Dep, rest, Kest, zest, LRP, TRP)
    {
      Fmsy <- rest*(zest/(1+zest))   # Fmsy of the Pella-Tomlinson model, as in PDM.PT below
      if(TRP > LRP){
        if(Dep <= LRP) Quota <- 0
        if(Dep > LRP & Dep < TRP) Quota <- (Fmsy/(TRP-LRP))*(Dep-LRP)*Bcurrent
        if(Dep >= TRP) Quota <- Fmsy*Bcurrent
      }
      if(TRP-LRP==0){
        if(Dep <= LRP) Quota <- 0
        if(Dep >= LRP) Quota <- Fmsy*Bcurrent
      }
      return(Quota)
    }
    
    # Pella-Tomlinson production model (z = 1 reduces to the Schaefer model)
    PDM.PT <- function(r, K, z, Dinitial, ProCV, Nyear)
    {
      Pmsy <- K*(1/(1+z))^(1/z)
      Fmsy <<- r*(z/(1+z))
      P <- numeric()
      Catch <- numeric()
      Bratio <- numeric()
      Fratio <- numeric()
      ProError <- rnorm(Nyear,0,ProCV)
      P[1] <- K*Dinitial
      for(t in 1:Nyear)
      {
        Pest <- P[t]*exp(rnorm(1,0,EstCV))
        Catch[t] <- HCR(Pest, Pest/K, r, K, z, LRP, TRP)
        if(t>=2){ 
          if(Catch[t-1]>0) change <- (Catch[t] - Catch[t-1])/Catch[t-1] 
          if(Catch[t-1]==0) change <- 0
          if(change > MaxChange/100) Catch[t] <- (1+MaxChange/100)*Catch[t-1]
          if(change < -MaxChange/100) Catch[t] <- (1-MaxChange/100)*Catch[t-1]
        }
        tmp <- P[t] + r*P[t]*(1-(P[t]/K)^z) - Catch[t]
        P[t+1] <- max(tmp*exp(ProError[t]),1)
      }
      Out <- NULL
      Out$P <- P
      Out$Bratio <- P/Pmsy
      Out$Fratio <- Catch/(Fmsy*Pmsy)
      Out$Catch <- Catch
      return(Out)
    }
    
    Nyear <<- input$Nyear
    Dinitial <<- input$Dinitial
    r <<- input$r
    K <<- input$K
    z <<- input$z
    ProCV <<- input$ProCV
    EstCV <<- input$EstCV
    LRP <<- input$LRP
    TRP <<- input$TRP
    MaxChange <<- input$MaxChange
    Nsim <<- input$Nsim
    
    P.mat <- array(0,c(Nsim, Nyear+1))
    Bratio.mat <- array(0,c(Nsim, Nyear+1))
    Catch.mat <- array(0,c(Nsim, Nyear))
    Fratio.mat <- array(0,c(Nsim, Nyear))
    for(i in 1:Nsim){
      Res <- PDM.PT(r,K,z,Dinitial,ProCV,Nyear)
      P.mat[i,] <- Res$P
      Bratio.mat[i,] <- Res$Bratio
      Fratio.mat[i,] <- Res$Fratio
      Catch.mat[i,] <- Res$Catch
    }
    SimRes<-NULL
    SimRes$P.mat <- P.mat
    SimRes$Bratio.mat <- Bratio.mat
    SimRes$Catch.mat <- Catch.mat
    SimRes$Fratio.mat <- Fratio.mat
    SimRes$P.med <- apply(P.mat,2,median)
    SimRes$Bratio.med <- apply(Bratio.mat,2,median)
    SimRes$Fratio.med <- apply(Fratio.mat,2,median)
    return(SimRes)
    
  })
  
  output$Trajectories_B <- renderPlot({
    
    SimRes <- Simulation()
    par(mfrow=c(3,1), mar=c(2,2,2,2))
    
    plot(0,0, main="Population size", xlim=c(0,Nyear), 
         ylim=c(0,K*1.1), xaxs="i", yaxs="i")
    for(i in 1:Nsim) points(0:Nyear, SimRes$P.mat[i,], type="l", col="lightgray")
    abline(K,0,lty=2)
    points(0:Nyear, SimRes$P.med, type="l", col="blue", lwd=2)
    
    plot(0,0, main="Depletion level", xlim=c(0,Nyear), 
         ylim=c(0,1.1), xaxs="i", yaxs="i")
    for(i in 1:Nsim) points(0:Nyear, SimRes$P.mat[i,]/K, type="l", col="lightgray")
    points(0:Nyear, SimRes$P.med/K, type="l", col="blue", lwd=2)
    rect(-100,0,100,LRP, col=rgb(1, 0, 0, alpha=0.2))
    rect(-100,LRP,100,TRP, col=rgb(1, 1, 0, alpha=0.2))
    rect(-100,TRP,100,10, col=rgb(0, 1, 0, alpha=0.2))
    
    plot(0,0, main="B-ratio", xlim=c(0,Nyear),
         ylim=c(0,max(c(max(SimRes$Bratio.mat)*1.1,2))), xaxs="i", yaxs="i")
    for(i in 1:Nsim)  points(0:Nyear, SimRes$Bratio.mat[i,], 
                             type="l", col="lightgray")
    points(0:Nyear, SimRes$Bratio.med, type="l", col="blue", lwd=2)
    rect(-100,0,100,1, col=rgb(1, 0, 0, alpha=0.2))
    rect(-100,1,100,100, col=rgb(0, 1, 0, alpha=0.2))
    
  })
  
  output$Trajectories_F <- renderPlot({
    
    SimRes <- Simulation()
    par(mfrow=c(3,1), mar=c(2,2,2,2))
    
    plot(0,0,main="Catch", xlim=c(0,Nyear), ylim=c(0,max(SimRes$Catch)*1.1),
         xaxs="i", yaxs="i")
    for(i in 1:Nsim) points(1:Nyear, SimRes$Catch[i,], type="l", col="lightgray")
    Catch.med <- apply(SimRes$Catch.mat,2,median)
    points(1:Nyear, Catch.med, type="l", col="blue", lwd=2)
    
    plot(0,0,main="Change in catch (%)", xlim=c(0,Nyear), ylim=c(-100,100),
         xaxs="i", yaxs="i")
    for(i in 1:Nsim){
      points(1:(Nyear-1),
         100*(SimRes$Catch.mat[i,-1]-SimRes$Catch.mat[i,-Nyear])/SimRes$Catch.mat[i,-Nyear], 
         type="l", col="lightgray")
    }   # close the loop here so the reference lines are drawn only once
    abline(MaxChange, 0, lty=2, col="blue")
    abline(0, 0, lty=2, col="gray")
    abline(-MaxChange, 0, lty=2, col="blue")
    
    plot(0,0,main="F-ratio", xlim=c(0,Nyear),
         ylim=c(0,max(c(max(SimRes$Fratio.mat)*1.1, 2))), xaxs="i", yaxs="i")
    for(i in 1:Nsim){ 
      points(1:Nyear, SimRes$Fratio.mat[i,], type="l", col="lightgray")}
    points(1:Nyear, SimRes$Fratio.med, type="l", col="blue", lwd=2)
    rect(-100,0,100,1, col=rgb(0, 1, 0, alpha=0.2))
    rect(-100,1,100,100, col=rgb(1, 0, 0, alpha=0.2))
    
  })
  
  output$Kobe1 <- renderPlot({
    
    SimRes <- Simulation()    
    par(mfrow=c(1,1), xaxs="i", yaxs="i")
    
    Bratio.med <- SimRes$Bratio.med[-(Nyear+1)]
    Fratio.med <- SimRes$Fratio.med
    
    xmax<-max(Bratio.med*1.05,2)
    ymax<-max(Fratio.med*1.05,2)
    plot(Bratio.med, Fratio.med, xlab="B-ratio",
         ylab="F-ratio",xlim=c(0,xmax),ylim=c(0,ymax),col="red")
    rect(0,1,1,ymax,col="lightsalmon")
    rect(0,0,1,1,col="lightyellow")
    rect(1,1,xmax,ymax,col="lightyellow")
    rect(1,0,xmax,1,col="lightgreen")
    points(SimRes$Bratio.mat[,Nyear],SimRes$Fratio.mat[,Nyear], 
           pch=19, cex=0.5, col="gray")
    lines(Bratio.med,Fratio.med,type="o")
    
  })
  
  output$Performance <- renderTable({
    SimRes <- Simulation()   
    Catch.med <- apply(SimRes$Catch.mat,2,median)
    judge <- function(x) mean(x>1)
    Prob1 <- apply(SimRes$Bratio.mat[,-1], 2, judge)
    Prob2 <- apply(SimRes$Fratio.mat, 2, judge)
    tmp <- data.frame(Year=1:Nyear, B.med=SimRes$P.med[-1], 
                Catch.med=Catch.med, Bratio.med=SimRes$Bratio.med[-1], 
                Fratio.med=SimRes$Fratio.med, 
                "Prob B greater than Bmsy" = Prob1, 
                "Prob F smaller than Fmsy" = 1-Prob2)
    return(tmp)
  })
  
  output$HCR <- renderPlot({
    SimRes <- Simulation()
    plot(0,0,type="n",xlim=c(0,1), ylim=c(0,Fmsy*1.5), 
         xlab="Depeletion level", ylab="Fishing Intensity")
    lines(c(0,LRP),c(0,0),lwd=2)
    lines(c(LRP,TRP),c(0,Fmsy),lwd=2)
    lines(c(TRP,1),c(Fmsy,Fmsy),lwd=2)
    lines(c(TRP,TRP),c(0,Fmsy),lty=2)
    lines(c(0,TRP),c(Fmsy,Fmsy),lty=2)
  })
  
  output$report <- downloadHandler(
    
    filename = function(){
      paste("MSE",format(Sys.time(), "%d-%b-%Y %H.%M"), ".pdf", sep=".")
    },
    content = function(file){
      pdf(file)
      hist(rnorm(10000,0,1), main="Temp")   # placeholder content for the downloadable report
      dev.off()
    }
  )

}

# Run the application 
shinyApp(ui = ui, server = server)
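If you save all of the code above in a file named “app.R” inside a folder (the folder name “MSE_demo” below is just an example), you can also launch the app with runApp:

shiny::runApp("MSE_demo")   # the folder must contain the app.R file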



Lecture 17. Hands-on exercises for Data-limited stock assessment

Preparation

A function “catchmsy” in “fishmethods” package

The main package is “fishmethods”. We shall also call another useful package specific to fishery stock assessment, “FSA” (http://derekogle.com/fishR/packages), to use some convenient functions in it. The comparison plots later in this lecture additionally use “ggplot2” and “gridExtra”, so they are loaded here as well.

#"fishmethods"
#install.packages("fishmethods")
library(fishmethods)

#additional (https://www.rdocumentation.org/packages/FSA/versions/0.8.22)
#install.packages("FSA")
library(FSA)
#data(package="fishmethods", verbose=F)

Our own function

We need a simple population dynamics model to push parameter sets through and calculate the biomass trajectory with confidence intervals; a Schaefer-type model is defined below.

#Schaefer production model: project biomass forward given par = (r, K, Dinitial)
PDM.SC <- function(par,Catch)
{
  Nyear <- length(Catch)+1
  r <- par[1]         #intrinsic rate of increase
  K <- par[2]         #carrying capacity
  Dinitial <- par[3]  #initial depletion level (B1/K)
  P <- numeric(Nyear)
  P[1] <- K*Dinitial
  for(t in 2:Nyear)
  {
    tmp <- P[t-1] + r*P[t-1]*(1-P[t-1]/K) - Catch[t-1]
    P[t] <- max(tmp,1)   #floor at 1 to avoid non-positive biomass
  }
  return(P)
}
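As a quick check of PDM.SC, here is a toy run with made-up values (not real data): a constant catch of 50 t for 10 years under r = 0.2, K = 1000 and an initial depletion of 0.8 should return 11 annual biomass values.

toy.catch <- rep(50, 10)
PDM.SC(c(0.2, 1000, 0.8), toy.catch)   # par = (r, K, Dinitial)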

A function “cmsy” in “datalimited” package

#Another package "datalimited" (https://rdrr.io/github/datalimited/datalimited/)
#Note: requires a separate installation of JAGS (https://mcmc-jags.sourceforge.io/)
#install.packages("devtools")
#devtools::install_github("datalimited/datalimited")
library("datalimited")

EXAMPLES (1) Lingcod with “catchmsy” in “fishmethods” package and “cmsy” in “datalimited” package

Data

Lingcod catch data from literature sources in Martell and Froese (2012).

CurrentDir <- getwd()
NewDir <- paste(CurrentDir, "/DataLimit/lingcod", sep="")
#dir.create(NewDir)
setwd(NewDir)
data(lingcod)
headtail(lingcod, 5)
    year catch
1   1889 150.0
2   1890 100.0
3   1891 150.0
4   1892 300.0
5   1893 170.0
109 1997  57.2
110 1998  67.7
111 1999  71.2
112 2000  46.3
113 2001  46.3
summary(lingcod$catch)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   24.8   310.5   853.1  1155.0  1588.4  4339.0 

catchmsy

set.seed(2019)
nsims <- 1000 #increase to 10000 for the actual computation
par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.lingcod <- catchmsy(
  year=lingcod$year,
  catch=lingcod$catch, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.8,up=0.8,step=0),          #initial depletion level (B1/K)
  lt=list(low=0.01,up=0.25,refyr=2002),    #depletion level in the reference year
  sigv=0,
  k=list(dist="unif",low=4333,up=433300,mean=0,sd=0),      #prior for carrying capacity
  r=list(dist="unif",low=0.015,up=0.1,mean=0,sd=0),        #prior for intrinsic rate of increase
  M=list(dist="unif",low=0.18,up=0.18,mean=0.00,sd=0.00),  #natural mortality (fixed)
  nsims=nsims,
  graphs=1:11)

print(outpt$Initial)
          Distr Lower  Upper Mean   SD
l0    increment   0.8    0.8 <NA> <NA>
k          unif  4333 433300    0    0
r          unif 0.015    0.1    0    0
M          unif  0.18   0.18    0    0
lt         <NA>  0.01   0.25 <NA> <NA>
refyr      <NA>  2002   <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1) 2.5% (ll=1) 97.5% (ll=1)
k  127275.678   130452.4557 105084.4662  142118.2377
r       0.021        0.0181      0.0156       0.0315
M       0.180        0.1800      0.1800       0.1800
print(outpt$Estimates)
          Mean     Median       2.5%      97.5%
MSY    654.224   619.5212   549.6240   844.3805
BMSY 63637.839 65226.2278 52542.2331 71059.1188
FMSY     0.011     0.0091     0.0078     0.0157
Umsy     0.010     0.0083     0.0071     0.0143
OFL    189.364   195.8207    26.2610   291.1411

cmsy

cmsy(yr, ct, interyr_index = 2L, interbio = c(0, 1), bio_step = 0.05,
     start_r = resilience(NA), start_k = c(max(ct), 50 * max(ct)),
     startbio = if (ct[1]/max(ct) < 0.2) c(0.5, 0.9) else c(0.2, 0.6),
     sig_r = 0.05, reps = 1e+05, revise_bounds = TRUE,
     finalbio = if (ct[length(ct)]/max(ct) > 0.5) c(0.3, 0.7) else c(0.01, 0.4))

set.seed(2019)
x <- x.lingcod <- cmsy(
  yr=lingcod$year, 
  ct=lingcod$catch, 
  start_r = c(0.015,0.1), 
  start_k = c(max(lingcod$catch), 100*max(lingcod$catch)),
  reps=1e4)

par(mfrow = c(2, 2))
plot(lingcod$year, lingcod$catch, type="o", xlab="Year", ylab="Catch (t)")
plot(lingcod$year, apply(x$biomass, 2, median)[-1], type="o", ylab="Estimated biomass", xlab="Year")
hist(x$bmsy)
plot(x$theta$r, x$theta$k, col = "#00000030")

Comparison

The data frame “outpt$Values” returned by catchmsy has the columns “likeli”, “l0”, “k”, “r”, “M”, “MSY”, “Bmsy”, “Fmsy”, “Umsy”, “OFL” and “Btk”. Columns 4, 3 and 2 (r, k and l0) are extracted below in the (r, K, Dinitial) order expected by PDM.SC, and the accepted parameter sets are pushed through the model to obtain biomass and B/Bmsy trajectories with intervals.

par(mfrow=c(2,2))

#### catchmsy
Y1 <- c(lingcod$year, tail(lingcod$year,1)+1)
Accepted <- (outpt$Values$like==1)   #keep only the plausible ("accepted") parameter sets
Est.accepted <- as.matrix(outpt$Values[Accepted,c(4,3,2)])   #(r, k, l0) in PDM.SC's order

Bio.accepted <- apply(Est.accepted, 1, PDM.SC, Catch=lingcod$catch)
Bio_qunantile <- apply(Bio.accepted, 1, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
Bio_qunantile <- data.frame(Year=Y1, Bio=t(Bio_qunantile))
colnames(Bio_qunantile) <-
  c("Year", "Bio_q2.5", "Bio_q25", "Bio_q50", "Bio_q75", "Bio_q97.5")
#headtail(Bio_qunantile, 5)

Bmsy.accepted <- matrix(rep(Est.accepted[,2]/2, length(Y1)), byrow=T, nrow=length(Y1))

bbmsy.accepted <- apply(Est.accepted, 1, PDM.SC, Catch=lingcod$catch)/Bmsy.accepted
bbmsy_qunantile <- apply(bbmsy.accepted, 1, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
bbmsy_qunantile <- data.frame(Year=Y1, Bio=t(bbmsy_qunantile))
colnames(bbmsy_qunantile) <- 
  c("Year", "bbmsy_q2.5", "bbmsy_q25", "bbmsy_q50", "bbmsy_q75", "bbmsy_q97.5")
#headtail(bbmsy_qunantile, 5)

tmp1 <- ggplot(Bio_qunantile, aes(Year, Bio_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=Bio_q25, ymax=Bio_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=Bio_q2.5, ymax=Bio_q97.5), alpha = 0.1) +
    theme_light()   #reference line at 1 is only meaningful on the B/Bmsy scale

tmp2 <- ggplot(bbmsy_qunantile, aes(Year, bbmsy_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=bbmsy_q25, ymax=bbmsy_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=bbmsy_q2.5, ymax=bbmsy_q97.5), alpha = 0.1) +
    geom_hline(yintercept = 1, lty = 2) + theme_light()

#### cmsy
Bio_qunantile <- apply(x$biomass, 2, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
Bio_qunantile <- data.frame(Year=Y1, Bio=t(Bio_qunantile))
colnames(Bio_qunantile) <-
  c("Year", "Bio_q2.5", "Bio_q25", "Bio_q50", "Bio_q75", "Bio_q97.5")
#headtail(Bio_qunantile, 5)

tmp3 <- ggplot(Bio_qunantile, aes(Year, Bio_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=Bio_q25, ymax=Bio_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=Bio_q2.5, ymax=Bio_q97.5), alpha = 0.1) +
    theme_light()   #reference line at 1 is only meaningful on the B/Bmsy scale

tmp4 <- ggplot(x$bbmsy, aes(year, bbmsy_q50)) + geom_line()  +
    geom_ribbon(aes(ymin = bbmsy_q25, ymax = bbmsy_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin = bbmsy_q2.5, ymax = bbmsy_q97.5), alpha = 0.1) +
    geom_hline(yintercept = 1, lty = 2) + theme_light()

grid.arrange(tmp1, tmp2, tmp3, tmp4, nrow=2, ncol=2)

rm(outpt)
rm(x)
setwd(CurrentDir)

Blue Grenadier from Southeast Australia

Ricard, D., Minto, C., Jensen, O. P., & Baum, J. K. (2012). Examining the knowledge base and status of commercially exploited marine species with the RAM Legacy Stock Assessment Database. Fish and Fisheries, 13(4), 380-398.

http://doi.org/10.1111/j.1467-2979.2011.00435.x. http://ramlegacy.org

CurrentDir <- getwd()
NewDir <- paste(CurrentDir, "/DataLimit/blue_gren", sep="")
#dir.create(NewDir)
setwd(NewDir)

data(blue_gren)
headtail(blue_gren, 5)
     yr   ct
1  1979  512
2  1980  865
3  1981  511
4  1982  829
5  1983  935
29 2007 3190
30 2008 3960
31 2009 3290
32 2010 4220
33 2011 4220
summary(blue_gren$ct)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    511    2770    3370    4009    4540    9550 

catchmsy

par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.blue_gren <-catchmsy(
  year=blue_gren$yr,
  catch=blue_gren$ct, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.8,up=1.0,step=0),
  lt=list(low=0.1,up=0.5,refyr=2012),sigv=0,
  k=list(dist="unif",low=2*max(blue_gren$ct),up=10*max(blue_gren$ct),mean=0,sd=0),
  r=list(dist="unif",low=0.015,up=0.2,mean=0,sd=0),
  M=list(dist="unif",low=0.18,up=0.18,mean=0.00,sd=0.00),
  nsims=nsims)

print(outpt$Initial)
          Distr Lower Upper Mean   SD
l0    increment   0.8     1 <NA> <NA>
k          unif 19100 95500    0    0
r          unif 0.015   0.2    0    0
M          unif  0.18  0.18    0    0
lt         <NA>   0.1   0.5 <NA> <NA>
refyr      <NA>  2012  <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1) 2.5% (ll=1) 97.5% (ll=1)
k    87441.38    88301.7915  76385.4376    95351.051
r        0.17        0.1745      0.1236        0.197
M        0.18        0.1800      0.1800        0.180
print(outpt$Estimates)
          Mean     Median       2.5%      97.5%
MSY   3707.645  3739.2823  2881.6559  4392.0042
BMSY 43720.688 44150.8957 38192.7188 47675.5254
FMSY     0.085     0.0872     0.0618     0.0985
Umsy     0.075     0.0766     0.0549     0.0860
OFL   1904.030  1667.7685   645.6965  3696.2264

cmsy


set.seed(2019)
x <- x.blue_gren <- cmsy(
  yr=blue_gren$yr, 
  ct=blue_gren$ct, 
  start_r = c(0.015,0.2), 
  start_k = c(2*max(blue_gren$ct), 10*max(blue_gren$ct)),
  reps=1e4)

par(mfrow = c(2, 2))
plot(blue_gren$yr, blue_gren$ct, type="o", xlab="Year", ylab="Catch (t)")
plot(blue_gren$yr, apply(x$biomass, 2, median)[-1], type="o", ylab="Estimated biomass", xlab="Year")
hist(x$bmsy)
plot(x$theta$r, x$theta$k, col = "#00000030")

Comparison

par(mfrow=c(2,2))

#### catchmsy
Y1 <- c(blue_gren$yr, tail(blue_gren$yr,1)+1)
Accepted <- (outpt$Values$like==1)
Est.accepted <- as.matrix(outpt$Values[Accepted,c(4,3,2)])

Bio.accepted <- apply(Est.accepted, 1, PDM.SC, Catch=blue_gren$ct)
Bio_qunantile <- apply(Bio.accepted, 1, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
Bio_qunantile <- data.frame(Year=Y1, Bio=t(Bio_qunantile))
colnames(Bio_qunantile) <-
  c("Year", "Bio_q2.5", "Bio_q25", "Bio_q50", "Bio_q75", "Bio_q97.5")
#headtail(Bio_qunantile, 5)

Bmsy.accepted <- matrix(rep(Est.accepted[,2]/2, length(Y1)), byrow=T, nrow=length(Y1))

bbmsy.accepted <- apply(Est.accepted, 1, PDM.SC, Catch=blue_gren$ct)/Bmsy.accepted
bbmsy_qunantile <- apply(bbmsy.accepted, 1, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
bbmsy_qunantile <- data.frame(Year=Y1, Bio=t(bbmsy_qunantile))
colnames(bbmsy_qunantile) <- 
  c("Year", "bbmsy_q2.5", "bbmsy_q25", "bbmsy_q50", "bbmsy_q75", "bbmsy_q97.5")
#headtail(bbmsy_qunantile, 5)

tmp1 <- ggplot(Bio_qunantile, aes(Year, Bio_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=Bio_q25, ymax=Bio_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=Bio_q2.5, ymax=Bio_q97.5), alpha = 0.1) 

tmp2 <- ggplot(bbmsy_qunantile, aes(Year, bbmsy_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=bbmsy_q25, ymax=bbmsy_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=bbmsy_q2.5, ymax=bbmsy_q97.5), alpha = 0.1) +
    geom_hline(yintercept = 1, lty = 2) + theme_light()

#### cmsy
Bio_qunantile <- apply(x$biomass, 2, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
Bio_qunantile <- data.frame(Year=Y1, Bio=t(Bio_qunantile))
colnames(Bio_qunantile) <-
  c("Year", "Bio_q2.5", "Bio_q25", "Bio_q50", "Bio_q75", "Bio_q97.5")
#headtail(Bio_qunantile, 5)

tmp3 <- ggplot(Bio_qunantile, aes(Year, Bio_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=Bio_q25, ymax=Bio_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=Bio_q2.5, ymax=Bio_q97.5), alpha = 0.1) 

tmp4 <- ggplot(x$bbmsy, aes(year, bbmsy_q50)) + geom_line()  +
    geom_ribbon(aes(ymin = bbmsy_q25, ymax = bbmsy_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin = bbmsy_q2.5, ymax = bbmsy_q97.5), alpha = 0.1) +
    geom_hline(yintercept = 1, lty = 2) + theme_light()

grid.arrange(tmp1, tmp2, tmp3, tmp4, nrow=2, ncol=2)

rm(outpt)
rm(x)
setwd(CurrentDir)

Indian Ocean kawakawa

NewDir <- paste(CurrentDir, "/DataLimit/kawakawa", sep="")
#dir.create(NewDir)
setwd(NewDir)

kawakawa <- read.csv("IOTC_kawakawa.csv")
headtail(kawakawa, 5)
   Year  Catch
1  1950   5574
2  1951   3253
3  1952   3285
4  1953   3243
5  1954   4495
64 2013 167348
65 2014 155750
66 2015 152906
67 2016 150743
68 2017 159752
summary(kawakawa$Catch)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   3243   10087   38999   56292   89174  167348 

catchmsy

Spec 1

l0 ~ U(0.8, 1.0)
lt ~ U(0.3, 0.7) in 2018
k ~ U(maxC, 20*maxC)
r ~ U(0.01, 0.4)
M = 0.18

par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.kw1 <- catchmsy(
  year=kawakawa$Year,
  catch=kawakawa$Catch, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.8,up=1.0,step=0),
  lt=list(low=0.3,up=0.7,refyr=2018),
  sigv=0,
  k=list(dist="unif",low=max(kawakawa$Catch),up=20*max(kawakawa$Catch),mean=0,sd=0),
  r=list(dist="unif",low=0.01,up=0.4,mean=0,sd=0),
  M=list(dist="unif",low=0.18,up=0.18,mean=0.00,sd=0.00),
  nsims=nsims)

print(outpt$Initial)
          Distr  Lower   Upper Mean   SD
l0    increment    0.8       1 <NA> <NA>
k          unif 167348 3346960    0    0
r          unif   0.01     0.4    0    0
M          unif   0.18    0.18    0    0
lt         <NA>    0.3     0.7 <NA> <NA>
refyr      <NA>   2018    <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1)  2.5% (ll=1) 97.5% (ll=1)
k 2272450.232  2234293.8870 1328337.5691 3250623.0935
r       0.238        0.2299       0.0966       0.3942
M       0.180        0.1800       0.1800       0.1800
print(outpt$Estimates)
            Mean       Median        2.5%        97.5%
MSY   124523.913  122742.7265  71323.7251  170301.2517
BMSY 1136225.116 1117146.9435 664168.7846 1625311.5468
FMSY       0.119       0.1150      0.0483       0.1971
Umsy       0.102       0.0996      0.0432       0.1642
OFL   120304.951  113297.4717  44966.0781  198913.5488

cmsy


set.seed(2019)
x <- x.kawakawa <- cmsy(
  yr=kawakawa$Year, 
  ct=kawakawa$Catch, 
  start_r = c(0.01,0.4), 
  start_k = c(max(kawakawa$Catch), 20*max(kawakawa$Catch)),
  reps=1e4)

par(mfrow = c(2, 2))
plot(kawakawa$Year, kawakawa$Catch, type="o", xlab="Year", ylab="Catch (t)")
plot(kawakawa$Year, apply(x$biomass, 2, median)[-1], type="o", ylab="Estimated biomass", xlab="Year")
hist(x$bmsy)
plot(x$theta$r, x$theta$k, col = "#00000030")

Comparison

par(mfrow=c(2,2))

#### catchmsy
Y1 <- c(kawakawa$Year, tail(kawakawa$Year,1)+1)
Accepted <- (outpt$Values$like==1)
Est.accepted <- as.matrix(outpt$Values[Accepted,c(4,3,2)])

Bio.accepted <- apply(Est.accepted, 1, PDM.SC, Catch=kawakawa$Catch)
Bio_qunantile <- apply(Bio.accepted, 1, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
Bio_qunantile <- data.frame(Year=Y1, Bio=t(Bio_qunantile))
colnames(Bio_qunantile) <-
  c("Year", "Bio_q2.5", "Bio_q25", "Bio_q50", "Bio_q75", "Bio_q97.5")
#headtail(Bio_qunantile, 5)

Bmsy.accepted <- matrix(rep(Est.accepted[,2]/2, length(Y1)), byrow=T, nrow=length(Y1))

bbmsy.accepted <- apply(Est.accepted, 1, PDM.SC, Catch=kawakawa$Catch)/Bmsy.accepted
bbmsy_qunantile <- apply(bbmsy.accepted, 1, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
bbmsy_qunantile <- data.frame(Year=Y1, Bio=t(bbmsy_qunantile))
colnames(bbmsy_qunantile) <- 
  c("Year", "bbmsy_q2.5", "bbmsy_q25", "bbmsy_q50", "bbmsy_q75", "bbmsy_q97.5")
#headtail(bbmsy_qunantile, 5)

tmp1 <- ggplot(Bio_qunantile, aes(Year, Bio_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=Bio_q25, ymax=Bio_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=Bio_q2.5, ymax=Bio_q97.5), alpha = 0.1) 

tmp2 <- ggplot(bbmsy_qunantile, aes(Year, bbmsy_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=bbmsy_q25, ymax=bbmsy_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=bbmsy_q2.5, ymax=bbmsy_q97.5), alpha = 0.1) +
    geom_hline(yintercept = 1, lty = 2) + theme_light()

#### cmsy
Bio_qunantile <- apply(x$biomass, 2, quantile, prob=c(0.025, 0.25, 0.5, 0.75, 0.975))
Bio_qunantile <- data.frame(Year=Y1, Bio=t(Bio_qunantile))
colnames(Bio_qunantile) <-
  c("Year", "Bio_q2.5", "Bio_q25", "Bio_q50", "Bio_q75", "Bio_q97.5")
#headtail(Bio_qunantile, 5)

tmp3 <- ggplot(Bio_qunantile, aes(Year, Bio_q50)) + geom_line()  +
    geom_ribbon(aes(ymin=Bio_q25, ymax=Bio_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin=Bio_q2.5, ymax=Bio_q97.5), alpha = 0.1) 

tmp4 <- ggplot(x$bbmsy, aes(year, bbmsy_q50)) + geom_line()  +
    geom_ribbon(aes(ymin = bbmsy_q25, ymax = bbmsy_q75), alpha = 0.2) +
    geom_ribbon(aes(ymin = bbmsy_q2.5, ymax = bbmsy_q97.5), alpha = 0.1) +
    geom_hline(yintercept = 1, lty = 2) + theme_light()

grid.arrange(tmp1, tmp2, tmp3, tmp4, nrow=2, ncol=2)

setwd(CurrentDir)

Indian Ocean albacore

CurrentDir <- getwd()
NewDir <- paste(CurrentDir, "/DataLimit/albacore", sep="")
#dir.create(NewDir)
setwd(NewDir)

albacore <- read.csv("IOTC_ALB.csv")
headtail(albacore, 5)
   Year Catch
1  1950     6
2  1951     7
3  1952    68
4  1953  1095
5  1954  2831
57 2006 30114
58 2007 44479
59 2008 48148
60 2009 40480
61 2010 40000
summary(albacore$Catch)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      6   12434   19357   20970   30114   48148 

Spec 1

l0 ~ U(0.8, 1.0)
lt ~ U(0.3, 0.7) in 2011
k ~ U(5*maxC, 10*maxC)
r ~ U(0.01, 0.4)
M = 0.18

par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.alb1 <- catchmsy(
  year=albacore$Year,
  catch=albacore$Catch, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.8,up=1.0,step=0),
  lt=list(low=0.3,up=0.7,refyr=2011),
  sigv=0,
  k=list(dist="unif",low=5*max(albacore$Catch),up=10*max(albacore$Catch),mean=0,sd=0),
  r=list(dist="unif",low=0.01,up=0.4,mean=0,sd=0),
  M=list(dist="unif",low=0.18,up=0.18,mean=0.00,sd=0.00),
  nsims=nsims)

print(outpt$Initial)
          Distr  Lower  Upper Mean   SD
l0    increment    0.8      1 <NA> <NA>
k          unif 240740 481480    0    0
r          unif   0.01    0.4    0    0
M          unif   0.18   0.18    0    0
lt         <NA>    0.3    0.7 <NA> <NA>
refyr      <NA>   2011   <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1) 2.5% (ll=1) 97.5% (ll=1)
k   410892.31   418524.4824 319202.0067  479507.9260
r        0.33        0.3397      0.2313       0.3968
M        0.18        0.1800      0.1800       0.1800
print(outpt$Estimates)
           Mean      Median        2.5%       97.5%
MSY   33647.621  32677.2793  26602.2970  44040.0946
BMSY 205446.156 209262.2412 159601.0033 239753.9630
FMSY      0.165      0.1699      0.1157      0.1984
Umsy      0.140      0.1433      0.1001      0.1652
OFL   29405.006  28489.9766  14494.2949  50620.7185
rm(outpt)

Spec 2

k ~ U(5*maxC, 20*maxC)
r ~ U(0.1, 0.6)

par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.alb2 <- catchmsy(
  year=albacore$Year,
  catch=albacore$Catch, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.8,up=1.0,step=0),
  lt=list(low=0.3,up=0.7,refyr=2011),
  sigv=0,
  k=list(dist="unif",low=5*max(albacore$Catch),up=20*max(albacore$Catch),mean=0,sd=0),
  r=list(dist="unif",low=0.1,up=0.6,mean=0,sd=0),
  M=list(dist="unif",low=0.18,up=0.18,mean=0.00,sd=0.00),
  nsims=nsims)

print(outpt$Initial)
          Distr  Lower  Upper Mean   SD
l0    increment    0.8      1 <NA> <NA>
k          unif 240740 962960    0    0
r          unif    0.1    0.6    0    0
M          unif   0.18   0.18    0    0
lt         <NA>    0.3    0.7 <NA> <NA>
refyr      <NA>   2011   <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1) 2.5% (ll=1) 97.5% (ll=1)
k  552419.289   527158.5659  263522.086  922614.8950
r       0.289        0.2639       0.108       0.5664
M       0.180        0.1800       0.180       0.1800
print(outpt$Estimates)
           Mean      Median        2.5%       97.5%
MSY   34226.303  34644.0486  22180.0530  44358.2753
BMSY 276209.644 263579.2830 131761.0430 461307.4475
FMSY      0.144      0.1320      0.0540      0.2832
Umsy      0.122      0.1134      0.0482      0.2267
OFL   33380.951  33990.9295  13633.6452  50892.6777
rm(outpt)

Spec 3

l0 ~ U(0.9, 1.0)
lt ~ U(0.2, 0.8) in 2011
k ~ U(5*maxC, 10*maxC)
r ~ U(0.1, 0.4)

par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.alb3 <-catchmsy(
  year=albacore$Year,
  catch=albacore$Catch, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.9,up=1.0,step=0),
  lt=list(low=0.2,up=0.8,refyr=2011),
  sigv=0,
  k=list(dist="unif",low=5*max(albacore$Catch),up=10*max(albacore$Catch),mean=0,sd=0),
  r=list(dist="unif",low=0.1,up=0.4,mean=0,sd=0),
  M=list(dist="unif",low=0.18,up=0.18,mean=0.00,sd=0.00),
  nsims=nsims)

print(outpt$Initial)
          Distr  Lower  Upper Mean   SD
l0    increment    0.9      1 <NA> <NA>
k          unif 240740 481480    0    0
r          unif    0.1    0.4    0    0
M          unif   0.18   0.18    0    0
lt         <NA>    0.2    0.8 <NA> <NA>
refyr      <NA>   2011   <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1) 2.5% (ll=1) 97.5% (ll=1)
k  410204.822   417572.7346 303335.3076  478909.3660
r       0.326        0.3356      0.2208       0.3962
M       0.180        0.1800      0.1800       0.1800
print(outpt$Estimates)
           Mean      Median        2.5%       97.5%
MSY   33121.288  32408.4650  25185.9618  44615.7517
BMSY 205102.411 208786.3673 151667.6538 239454.6830
FMSY      0.163      0.1678      0.1104      0.1981
Umsy      0.138      0.1417      0.0958      0.1650
OFL   27859.414  26897.0843   9681.2116  51673.6699
rm(outpt)

Spec 4

k ~ U(5*maxC, 10*maxC)
r ~ U(0.1, 0.4)
M = 0.3

par(mfrow=c(4,3), mar=c(4,4,2,2))

outpt <- outpt.alb4 <-catchmsy(
  year=albacore$Year,
  catch=albacore$Catch, 
  catchCV=NULL,
  catargs=list(dist="none",low=0,up=Inf,unit="MT"),
  l0=list(low=0.8,up=1.0,step=0),
  lt=list(low=0.3,up=0.7,refyr=2011),
  sigv=0,
  k=list(dist="unif",low=5*max(albacore$Catch),up=10*max(albacore$Catch),mean=0,sd=0),
  r=list(dist="unif",low=0.1,up=0.4,mean=0,sd=0),
  M=list(dist="unif",low=0.3,up=0.3,mean=0.00,sd=0.00),
  nsims=nsims)

print(outpt$Initial)
          Distr  Lower  Upper Mean   SD
l0    increment    0.8      1 <NA> <NA>
k          unif 240740 481480    0    0
r          unif    0.1    0.4    0    0
M          unif    0.3    0.3    0    0
lt         <NA>    0.3    0.7 <NA> <NA>
refyr      <NA>   2011   <NA> <NA> <NA>
print(outpt$Parameters)
  Mean (ll=1) Median (ll=1) 2.5% (ll=1) 97.5% (ll=1)
k   416288.52   420228.9659 325920.7899  477985.1768
r        0.33        0.3382      0.2349       0.3961
M        0.30        0.3000      0.3000       0.3000
print(outpt$Estimates)
           Mean      Median        2.5%       97.5%
MSY   34071.748  33316.5798  26923.8523  43214.3455
BMSY 208144.261 210114.4829 162960.3949 238992.5884
FMSY      0.165      0.1691      0.1175      0.1981
Umsy      0.132      0.1350      0.0960      0.1560
OFL   28900.940  27945.1198  14171.7147  46426.0078
rm(outpt)
setwd(CurrentDir)

Lecture 18. Wrap-up discussion on future workplan and collaboration