  • Programming Language R

    Coding Exercise

    Plotting a binomial distribution, scatter plot, box plot, probability distribution, and Gaussian curve

    Problem 1: Binomial Distribution

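    # dplyr supplies %>%, mutate, select, filter, bind_rows and bind_cols;
    # ggplot2 supplies ggplot and geom_line
    library(dplyr)
    library(ggplot2)

    n <- 60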
    # for p=0.3, generating statistical summary
    p <- 0.3
    bin_dist_0.3 <- dbinom(0:n, n, p)
    
    summary(bin_dist_0.3)
    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##    0.00    0.00    0.00    0.02    0.01    0.11
    sd(bin_dist_0.3)

    ## [1] 0.03

    # for p=0.5, generating statistical summary

    p <- 0.5
    bin_dist_0.5 <- dbinom(0:n, n, p)
    summary(bin_dist_0.5)

    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##    0.00    0.00    0.00    0.02    0.01    0.10

    sd(bin_dist_0.5)

    ## [1] 0.03

    # for p=0.8, generating statistical summary

    p <- 0.8
    bin_dist_0.8 <- dbinom(0:n, n, p)
    summary(bin_dist_0.8)

    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##    0.00    0.00    0.00    0.02    0.01    0.13

    sd(bin_dist_0.8)

    ## [1] 0.04
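
    Note that summary() and sd() above describe the 61 probability values returned by dbinom, not the spread of the binomial outcomes themselves. As a minimal sketch (the helper name binom_moments is an assumption for illustration, not part of the exercise), the mean and standard deviation of each distribution can be computed from the probabilities and compared with the closed forms n*p and sqrt(n*p*(1-p)):

    ```r
    # hypothetical helper: moments of Binomial(n, p) computed from its pmf
    binom_moments <- function(n, p) {
      k <- 0:n
      prob <- dbinom(k, n, p)
      mu <- sum(k * prob)                      # E[X]
      sigma <- sqrt(sum((k - mu)^2 * prob))    # standard deviation of X
      c(mean = mu, sd = sigma)
    }

    n <- 60
    for (p in c(0.3, 0.5, 0.8)) {
      m <- binom_moments(n, p)
      # closed forms: n * p and sqrt(n * p * (1 - p))
      cat(sprintf("p=%.1f  mean=%.2f (n*p=%.2f)  sd=%.2f (closed form=%.2f)\n",
                  p, m["mean"], n * p, m["sd"], sqrt(n * p * (1 - p))))
    }
    ```

    The two columns should agree for every p, which is a quick check that dbinom(0:n, n, p) covers the whole support.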

    Preparing data for the plot

    x <- seq(0, n, 1)
    X <- c(x, x, x)
    bin_dist <- c(bin_dist_0.3, bin_dist_0.5, bin_dist_0.8)
    # label each block of length(x) = 61 rows with its success probability
    binomial.distribution <- data.frame(
      X, bin_dist,
      p = rep(c("p=0.3", "p=0.5", "p=0.8"), each = length(x)))

    Plotting Binomial Distribution

    binomial.distribution %>%
      ggplot() +
      geom_line(aes(x=X, y=bin_dist, col=p)) +
      ggtitle("Binomial Distribution") +
      ylab("")

    Plotting Boxplot

    boxplot(bin_dist ~ p,
            data = binomial.distribution,
            names = c("p=0.3", "p=0.5", "p=0.8"))

    Problem 2: Relationship between waiting and duration of eruptions

    Getting data

    faithful.data <- as.data.frame(faithful)
    attach(faithful.data)

    Scatter plot of waiting time vs. duration of eruptions

    plot(x = eruptions,
         y = waiting,
         main = "waiting vs. duration",
         xlab = "duration")

    Fitting a linear model

    model <- lm(waiting ~ eruptions)
    model
    ##
    ## Call:
    ## lm(formula = waiting ~ eruptions)
    ##
    ## Coefficients:
    ## (Intercept)    eruptions 
    ##        33.5         10.7
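
    To make the coefficients concrete, here is a minimal sketch that refits the same model and uses it for prediction. The names fit and new_durations are assumptions for illustration; the data argument replaces attach(), and both variables in the built-in faithful data set are measured in minutes.

    ```r
    # refit on the built-in faithful data without attach()
    fit <- lm(waiting ~ eruptions, data = faithful)

    # intercept and slope; the slope is the extra waiting time (minutes)
    # associated with one additional minute of eruption
    coef(fit)

    # predicted waiting time after eruptions lasting 2 and 4 minutes
    new_durations <- data.frame(eruptions = c(2, 4))
    predict(fit, newdata = new_durations)

    # share of the variance in waiting explained by eruption duration
    summary(fit)$r.squared
    ```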

    Linear association between duration of eruptions and waiting time between eruptions

    plot(x = eruptions,
         y = waiting,
         main = "waiting vs. duration",
         xlab = "duration")
    abline(model,
           col = "red",
           lwd = 3)

    Problem 3: short vs. long eruptions

    detach(faithful.data)
    faithful.data <- faithful.data %>%
      mutate(type = ifelse(eruptions<3.1, "short", "long" ))

    boxplot(waiting ~ type,
            data = faithful.data,
            ylab = "waiting",
            main = "Waiting between eruptions: long vs. short eruptions")
    boxplot(eruptions ~ type,
            data = faithful.data,
            ylab = "Duration of eruptions",
            main = "Duration of each eruption: long vs. short eruptions")
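
    The box plots can be backed with numbers. The following is a small sketch, assuming the same 3.1-minute threshold as above and working directly on the built-in faithful data, that counts the eruptions in each group and compares their medians:

    ```r
    # label eruptions with the same 3.1-minute threshold used above
    type <- ifelse(faithful$eruptions < 3.1, "short", "long")

    # number of eruptions in each group
    table(type)

    # median waiting time and median duration per group
    tapply(faithful$waiting, type, median)
    tapply(faithful$eruptions, type, median)
    ```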

    Problem 4: Uniform Probability Distribution

    Generating a random variable that follows a uniform distribution

    n <- 10000
    min <- -1
    max <- 2
    parameter <- runif(n, min = min, max = max)

    Plotting the distribution of the random variable generated above

    breaks <- seq(min, max, (max-min)/20)
    xrange <- range(-2,3)

    hist(parameter,
         breaks = breaks,
         right = FALSE,
         col = "light blue",
         xlim = xrange,
         main = "Distribution of random variable generated using runif")

    Relative cumulative frequency of uniformly distributed variable

    parameter.cut <- cut(parameter, breaks, right = FALSE)
    parameter.freq <- table(parameter.cut)
    parameter.relfreq <- parameter.freq/n
    parameter.cumfreq <- cumsum(parameter.freq)
    parameter.cumrelfreq <- parameter.cumfreq / n
    parameter.cumrelfreq.0 <- c(0, parameter.cumrelfreq)

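    Plotting the cumulative relative frequency of the uniformly distributed random variable

    # plot against the bin edges, starting from 0 at the lower bound
    plot(breaks, parameter.cumrelfreq.0,
         ylab = "Cumulative relative frequency")
    lines(breaks, parameter.cumrelfreq.0)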

    When enough values of the random variable are generated, the histogram flattens out and the cumulative relative frequency approaches a straight line, as expected for a uniform distribution.
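
    One way to check this numerically is to measure how far the empirical cumulative distribution of a sample lies from the theoretical uniform one. This is a sketch under stated assumptions: the names lower, upper and max_cdf_gap are illustrative, and the seed is fixed only for reproducibility.

    ```r
    set.seed(1)                      # assumption: fixed seed for reproducibility
    lower <- -1
    upper <- 2

    # largest gap, on a grid of 201 points, between the empirical CDF
    # of a sample of size n and the Uniform(lower, upper) CDF
    max_cdf_gap <- function(n) {
      sample_values <- runif(n, min = lower, max = upper)
      grid <- seq(lower, upper, length.out = 201)
      max(abs(ecdf(sample_values)(grid) - punif(grid, lower, upper)))
    }

    # the gap tends to shrink as the sample size grows
    sapply(c(100, 1000, 10000, 100000), max_cdf_gap)
    ```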

    Problem 5:

    Generating a 100×40 matrix

    matrix <- replicate(40, runif(100,min,max))

    Preparation for plotting (x,y1) and (x,y2)

    y1 <- matrix[,1]
    y2 <- matrix[,2]

    y1.cut <- cut(y1, breaks, right = FALSE)
    y1.freq <- table(y1.cut)
    y1.relfreq <- y1.freq/100

    y1data <- data.frame(distribution = y1.relfreq, variable = "y1")
    y1data <- y1data %>% select(-distribution.y1.cut)

    parameterdata <- data.frame(distribution = parameter.relfreq, variable = "parameter")
    parameterdata <- parameterdata %>% select(-distribution.parameter.cut)

    y2.cut <- cut(y2, breaks, right = FALSE)
    y2.freq <- table(y2.cut)
    y2.relfreq <- y2.freq/100
    y2data <- data.frame(distribution = y2.relfreq, variable = "y2")
    y2data <- y2data %>% select(-distribution.y2.cut)

    data <- bind_rows(parameterdata,y1data,y2data)
    # x positions: midpoints of the 20 bins defined by breaks
    x <- head(breaks, -1) + diff(breaks) / 2
    X <- c(x,x,x)
    X <- as.data.frame(X)
    data1 <- bind_cols(data, X)

    Plotting (x,y1)

    data1 %>%
      # %in% keeps every row of both groups; == would recycle the vector
      filter(variable %in% c("y1", "parameter")) %>%
      ggplot() +
      geom_line(aes(x = X , y = distribution.Freq, col = variable), size = 1) +
      ggtitle("Comparing Relative Distribution of y1 and parameter") +
      ylab("Relative Frequency") +
      xlab("")

    Plotting (x,y2)

    data1 %>%
      filter(variable %in% c("y2", "parameter")) %>%
      ggplot() +
      geom_line(aes(x = X , y = distribution.Freq, col = variable), size = 1) +
      ggtitle("Comparing Relative Distribution of y2 and parameter") +
      ylab("Relative Frequency") +
      xlab("")

    Problem 6: Gaussian Curve

    summation <- rowSums(matrix)
    matrix1 <- cbind(matrix, summation)
    normal <- matrix1[,41]
    hist(normal,
         main = "Plotting summation of Columns",
         col = "light blue")

    The row sums approximate a normal distribution, as the central limit theorem predicts for a sum of 40 independent uniform variables.
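
    The approximation can be checked against theory. A sum of k independent Uniform(a, b) variables has mean k(a+b)/2 and variance k(b-a)^2/12, so here the mean is 20 and the standard deviation is sqrt(30), about 5.48. The sketch below is illustrative only: it uses 10000 rows instead of 100 so the shape is easier to see, and the names lower, upper, k, sums, mu and sigma are assumptions.

    ```r
    set.seed(1)                       # assumption: fixed seed for reproducibility
    lower <- -1
    upper <- 2
    k <- 40                           # number of columns summed per row

    # more rows than the 100 used above, so the histogram is smoother
    sums <- rowSums(matrix(runif(10000 * k, lower, upper), ncol = k))

    # theoretical moments of a sum of k independent Uniform(lower, upper) variables
    mu <- k * (lower + upper) / 2                 # 40 * 0.5 = 20
    sigma <- sqrt(k * (upper - lower)^2 / 12)     # sqrt(40 * 0.75), about 5.48

    # density-scaled histogram with the matching normal curve on top
    hist(sums, freq = FALSE, col = "light blue",
         main = "Row sums with normal density overlay")
    curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, col = "red", lwd = 3)

    # sample moments, for comparison with mu and sigma
    c(sample_mean = mean(sums), sample_sd = sd(sums))
    ```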

  • Robotic Process Automation (RPA)

    RPA is a software tool that integrates applications to automate routine, predictable tasks using structured digital data. An RPA tool operates by deploying a software script (also known as a ‘bot’) that imitates human tasks within a business workflow. These bots act like a human entering and consuming information across multiple IT systems.

    An RPA tool possesses three core competencies: a low-code/no-code development environment in which citizen developers create bots; integration with enterprise applications through UI interaction, APIs, and connectors; and an orchestrator (a control dashboard) to configure and monitor bots. In addition, an RPA tool may provide mechanisms to align itself with planned changes in the enterprise application and system ecosystem, automated disaster recovery, support for various hosting options, data security and role-based privileges, built-in AI (artificial intelligence), ML (machine learning), and NLP (natural language processing) capabilities, and process mining and discovery capabilities.

    RPA tools are used mainly in three areas: getting data into or between systems, consolidating data into reports or standardized formats, and automating or building a structured, predetermined workflow. Financial institutions have been early adopters of RPA. Many onerous back-office functions, such as ensuring that an up-to-date Know Your Client (KYC) form is filed or that a recent credit check is included in a loan application, are ideal for RPA. An RPA tool can be triggered manually or automatically, move or populate data between prescribed locations, document audit trails, conduct calculations, perform actions, and trigger downstream activities.

    Why: Easy-to-develop bots provide speed to value, high ROI, accuracy, and easy integration

    Application leaders apply robotic process automation as a noninvasive integration method to automate routine, repetitive, predictable tasks and unlock tactical benefits. RPA is designed to work with most legacy applications, which makes it easier to implement than other enterprise automation solutions. Organizations can take an agile approach to implementing RPA solutions because they are scalable and work well with existing IT infrastructure. Bots are easy to create and integrate using a no-code/low-code development environment. RPA streamlines processes and enables strategic analysis by providing event logs. RPA is the least expensive of the cognitive technologies and can be a catalyst for digital transformation by providing greater productivity, speed to value, and a high return on investment.

    How: Mine, identify and prioritize

    An organization can follow the process flow below to deploy an RPA solution.

    [Figure: process flow for deploying an RPA solution]

    Major vendors of RPA tools include Microsoft (Power Automate) and UiPath.

    Recommendations: Bots augment the human workforce

    A financial institution, especially a wealth management organization, can deploy RPA for client onboarding, interactive what-if scenarios, and exception handling in trade processing. The organization can benefit from bots that perform reconciliation: retrieving data in numerous forms from external parties and internal accounting and recordkeeping systems, formatting the information, comparing data sets, and making corrections and adjustments based on defined rules. Bots augment the human workforce by carrying out tedious tasks more quickly and accurately, while employees focus on high-value tasks.

    Example of RPA Opportunity
